## IOWA STATE UNIVERSITY Digital Repository

Graduate Theses and Dissertations

Iowa State University Capstones, Theses and Dissertations

2017

# Performance enhancement techniques for operational amplifiers

Bin Huang Iowa State University


#### Recommended Citation

Huang, Bin, "Performance enhancement techniques for operational amplifiers" (2017). *Graduate Theses and Dissertations*. 17210. https://lib.dr.iastate.edu/etd/17210




## Performance enhancement techniques for operational amplifiers

by

## **Bin Huang**

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

## DOCTOR OF PHILOSOPHY

Major: Electrical Engineering

Program of Study Committee:

Degang Chen, Major Professor

Randall Geiger

Nathan Neihart

Chris Chu

Jiming Song

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University

Ames, Iowa

2017

Copyright © Bin Huang, 2017. All rights reserved.



## DEDICATION

To my parents



## TABLE OF CONTENTS

- LIST OF FIGURES
- LIST OF TABLES
- ACKNOWLEDGMENTS
- ABSTRACT
- CHAPTER 1. INTRODUCTION
  - 1.1. Background
  - 1.2. Dissertation Outline
  - 1.3. References
- CHAPTER 2. GAIN ENHANCEMENT FOR OPERATIONAL AMPLIFIERS
  - 2.1. Introduction
  - 2.2. Literature Review
    - 2.2.1 General approaches for op amp DC gain enhancement
    - 2.2.2 A state-of-the-art gain enhancement method via gds cancellation
  - 2.3. Principles of Robust Gain Enhancement via Gds Cancellation
  - 2.4. Concept of Proposed Gain Enhancement Method via Gds Cancellation
  - 2.5. A SDC-based Gain Enhancement Technique
  - 2.6. A FVA-based Gain Enhancement Technique
  - 2.7. The SDC-based vs. the FVA-based Gain Enhancement Technique
  - 2.8. A Current Mirror Input Op Amp with the FVA-based GE Technique [7]
    - 2.8.1 Operating Principles
    - 2.8.2 Sizing Strategies for DC Gain Boost
    - 2.8.3 Stability of an Op Amp with a RHP Dominant Pole
    - 2.8.4 Frequency Analysis
    - 2.8.5 Noise Analysis
    - 2.8.6 Simulation and Measurement Results
  - 2.9. A Folded Cascode Amplifier with the FVA-based GE Technique [0]
  - 2.10. Discussion
  - 2.11. Summary
- CHAPTER 3. SLEW RATE ENHANCEMENT FOR OPERATIONAL TRANSCONDUCTANCE AMPLIFIERS
  - 3.1. Introduction
  - 3.2. Literature Review
  - 3.3. Desired Features of Slew Rate Enhancement Circuits
  - 3.4. Proposed SRE Method via Excessive Transient Feedback
    - 3.4.1 Concept of the slew rate enhancement via excessive transient feedback
    - 3.4.2 Selections of sensing and driving nodes for a SRE circuit
  - 3.5. Design Example with the Proposed SRE Technique
    - 3.5.1 Small signal analysis
    - 3.5.2 Large signal analysis
  - 3.6. Simulation Results
  - 3.7. Summary
  - 3.8. References
- CHAPTER 4. POWER EFFICIENCY ENHANCEMENT FOR OP AMPS DRIVING LARGE CAPACITIVE LOADS
  - 4.1. Introduction
  - 4.2. Literature Review
    - 4.2.1 State of the art methods
    - 4.2.2 State-of-the-art methods
  - 4.3. Desired Features of Op Amp for Driving Large Capacitive Loads
  - 4.4. Concept of the Proposed Power-Efficient Op Amp Design for Driving Large Capacitive Loads
  - 4.5. Design Example
    - 4.5.1 Design of the V V preamp stage
    - 4.5.2 Design of the entire op amp
  - 4.6. Simulation Results
    - 4.6.1 Typical corner simulation results
    - 4.6.2 Process corner variation simulation results
    - 4.6.3 Mismatch variation simulation results
    - 4.6.4 Process corner plus mismatch variation simulation results
    - 4.6.5 Post-layout simulation results
  - 4.7. Performance Comparison of This Work with the Literature
  - 4.8. Discussion
  - 4.9. Summary
  - 4.10. References
- CHAPTER 5. CURRENT UTILIZATION EFFICIENCY ENHANCEMENT FOR FOLDED CASCODE AMPLIFIERS
  - 5.1. Introduction
  - 5.2. Literature Review
    - 5.2.1 General review
    - 5.2.2 A state-of-the-art FCA design for CUE enhancement
  - 5.3. Proposed FCA Output Stage Design for Low Noise, Offset and Power
    - 5.3.1 Desired features and conceptual design of a FCA output stage
    - 5.3.2 Proposed FCA core amplifier design
    - 5.3.3 Proposed FCA output stage design
  - 5.4. Simulation Results for Proposed FCA vs. Conventional Fast FCA
    - 5.4.1 Typical corner simulation results
    - 5.4.2 Process corner and temperature variation simulation results
    - 5.4.3 Mismatch variation simulation results
    - 5.4.4 Process corner plus mismatch variation simulation results
  - 5.5. Performance Comparison of This Work with the Literature
  - 5.6. Discussion
  - 5.7. Summary
  - 5.8. References
- CHAPTER 6. COMBINED PERFORMANCE ENHANCEMENT TECHNIQUES FOR FOLDED CASCODE AMPLIFIERS
  - 6.1. Schematic Design
  - 6.2. Frequency Response Analysis
  - 6.3. Noise Analysis
  - 6.4. Offset Voltage Analysis
  - 6.5. Simulation Results
    - 6.5.1 Typical corner simulation results
    - 6.5.2 Process corner and temperature variation simulation results
    - 6.5.3 Process corner plus mismatch variation simulation results
  - 6.6. Performance Comparison to the Literature
  - 6.7. Discussion
  - 6.8. Summary
  - 6.9. References
- CHAPTER 7. CONCLUSION



## **LIST OF FIGURES**

- Figure 2.1: Gain enhancement via (a) cascoding transistors (b) cascading gain stages (c) regulated gain boost (d) conductance cancellation
- Figure 2.2: Yan's conductance cancellation method
- Figure 2.3: Concept of the proposed gain enhancement method via gds cancellation
- Figure 2.4: SDC-based gds cancellation a) negative gds generator b) small signal circuit of the circuit in (a) c) low gain amplifier AN
- Figure 2.5: FVA-based gds cancellation a) negative gds generator b) small signal circuit of the circuit in (a) c) low gain amplifier AN2
- Figure 2.6: Schematic of the designed op amps
- Figure 2.7: A two-stage op amp with a RHP dominant pole in a negative feedback loop
- Figure 2.8: Small signal block diagram of the proposed op amp
- Figure 2.9: Noise model of the proposed op amp
- Figure 2.10: Layout and microphotograph of the fabricated prototype op amp
- Figure 2.11: Simulated DC gain vs. (a) P.T. variation (b) supply voltage (c) OSW
- Figure 2.12: Op amps' DC gain measurement (a) schematic (b) lab setup (c) measured DC gain vs. OSW
- Figure 2.13: Op amps' DC gain under P.Mis. variation (a) proposed op amp (b) conventional op amp (c) gain enhancement
- Figure 2.14: Post-layout simulated AC responses of the prop. and conv. op amps
- Figure 2.15: Measured transient performance of the proposed op amp
- Figure 2.16: Post-layout simulated noise performance of the two op amps
- Figure 2.17: A fully differential FCA with the FVA-based technique
- Figure 2.18: $g_{D1}$ and $g_{B1}$ under P.T. variation a) $g_{D1}$ b) $g_{B1}$
- Figure 2.19: $g_{D}$ and $g_{B}$ under P.T. variation a) $g_{D}$ b) $g_{B}$
- Figure 2.20: DC gain of the proposed and conventional op amp
- Figure 3.1: Conventional Class-A operational transconductance amplifier
- Figure 3.2: An OTA with the adaptive biasing circuit [3]
- Figure 3.3: Concept of the proposed SRE method
- Figure 3.4: Different types of SRE methods
- Figure 3.5: Designed one-stage OTA with the proposed SRE method
- Figure 3.6: Small signal transient response of the three designed OTAs
- Figure 3.7: Step responses of the three OTAs (a) output voltages (b) tail currents
- Figure 5.1: Schematic of a conventional folded cascode amplifier (FCA)
- Figure 5.2: Rudy's FCA a) the FCA's schematic b) floating battery in the FCA
- Figure 5.3: Desired features of a FCA's output stage
- Figure 5.4: A conceptual design of a FCA output stage
- Figure 5.5: A PMOS input FCA with differential-to-single-ended conversion on a) PMOS side b) NMOS side
- Figure 5.6: Frequency responses of the conventional fast and slow FCA
- Figure 5.7: Transient responses of the fast and slow FCA
- Figure 5.8: Schematic of the proposed FCA with a new turn-around stage
- Figure 5.9: Small signal block diagram of the proposed FCA
- Figure 5.10: Poles and zeros distribution of the proposed FCA
- Figure 5.11: Phase drop due to complex poles and zeros vs. k1 and k2
- Figure 5.12: The proposed FCA's PM vs. k1 and k2
- Figure 5.13: Noise model for the proposed FCA
- Figure 5.14: Frequency responses of the proposed and conventional FCAs
- Figure 5.15: Noise performance of the proposed and conventional FCAs
- Figure 5.16: Transient responses of the proposed and conventional FCAs
- Figure 5.17: Frequency responses of the two FCAs a) proposed b) conventional
- Figure 5.18: Noise performance of the prop. and conv. FCAs under P.T. variation
- Figure 5.19: Transient responses of the prop. and conv. FCAs under P.T. variation
- Figure 5.20: Transient responses of the prop. and conv. FCAs under mismatch variation
- Figure 5.21: Transient responses of the prop. and conv. FCAs under P.Mis. variation
- Figure 5.22: Average Ts_0.01% of the proposed FCA under P.Mis. variation
- Figure 5.23: Average Ts_0.01% of the conventional FCA under P.Mis. variation
- Figure 5.24: A circuit to reduce leakage current of M14 in the turn-around stage
- Figure 6.1: Schematic of the proposed FCA with gain, slew rate and CUE enhancement
- Figure 6.2: Schematic of the negative SRE circuit for the proposed FCA
- Figure 6.4: Distribution of the proposed FCA's poles and zeros
- Figure 6.5: Phase drop due to complex poles and zeros vs. k1 and k2
- Figure 6.6: The FCA's PM vs. k1 and k2
- Figure 6.7: Noise model for the proposed op amp
- Figure 6.8: Frequency responses of the proposed and conventional FCAs
- Figure 6.9: Noise performance of the proposed and conventional FCAs
- Figure 6.10: Transient responses of proposed and conventional FCAs
- Figure 6.11: Frequency responses of the two FCAs a) proposed b) conventional
- Figure 6.12: Noise performance of the prop. and conv. FCAs under P.T. variation
- Figure 6.13: Transient responses of the prop. and conv. FCAs under P.T. variation
- Figure 6.14: Transient responses of the prop. and conv. FCAs under P.Mis. variation
- Figure 6.15: Average Ts_0.01% of the proposed FCA under P.Mis. variation
- Figure 6.16: Average Ts_0.01% of the conventional FCA under P.Mis. variation



## **LIST OF TABLES**

- Table 2.1: Expressions of the conductance and capacitance in the proposed op amp
- Table 2.2: Summary of measured performance of the two op amps
- Table 2.3: Performance comparison to the literature
- Table 2.4: Performance summary of the designed op amps
- Table 3.1: Performance summary of the three designed OTAs
- Table 4.1: Expressions of parasitic capacitance for the op amp's input stage
- Table 4.2: Performance summary of the designed op amp in the typical corner
- Table 4.3: Process corner setups for the simulations of the designed op amp
- Table 4.4: Performance summary of the designed op amp under process corner variation
- Table 4.5: Performance summary of the designed op amp under mismatch variation
- Table 4.6: Performance summary of the designed op amp under P.Mis. variation
- Table 4.7: Performance comparison of this work in schematic and post-layout view with recently reported amplifiers
- Table 4.8: Performance comparison of this work with recently reported amplifiers
- Table 5.1: Performance summary of the designed conventional slow and fast FCAs
- Table 5.2: Expressions of the conductance and capacitance in the proposed FCA
- Table 5.3: Performance summary of the proposed and conventional FCAs
- Table 5.4: Simulation setup with process corner and temperature variation
- Table 5.5: Performance summary of the prop. and conv. FCAs under P.T. variation
- Table 5.6: Performance summary of the prop. and conv. FCA under mismatch variation
- Table 5.7: Performance summary of the prop. and conv. FCA under P.Mis. variation
- Table 5.8: Performance comparison of the proposed FCA to the state-of-the-art method and the conventional FCA
- Table 6.1: Expressions of the conductance and capacitance in the proposed FCA
- Table 6.2: Performance summary of the prop. and conv. FCAs in typical corner
- Table 6.3: Simulation setup with process corner and temperature variation
- Table 6.4: Performance summary of the prop. and conv. FCA under P.T. variation
- Table 6.5: Performance summary of the prop. and conv. FCA under P.Mis. variation
- Table 6.7: Performance comparison of the proposed FCA to the literature



## ACKNOWLEDGMENTS

First and foremost, I would like to thank my parents for their encouragement, support and unselfish love. All the support and encouragement they have provided to me over the years has been my greatest gift. By the same token, I am thankful for the continuous support, trust and love of my wife, Manman Qian.

My deepest appreciation goes to my advisor, Dr. Degang Chen. His guidance and advice have been invaluable to my research as well as my career. He has made my Ph.D. study a wonderful and unforgettable journey.

I would also like to thank my committee members, in alphabetical order, Dr. Chris Chu, Dr. Randall Geiger, Dr. Nathan Neihart, and Dr. Jiming Song, for their advice and recommendations throughout my graduate program and for their service in my various examination committees.

I would also like to express my thanks to my peers including Yongjie Jiang, Chih-Wei Chen and Bharath Vasan. The technical exchange of ideas with them was especially helpful and therefore appreciated.



## ABSTRACT

Operational amplifiers (op amps) are one of the most fundamental and widely used building blocks for analog and mixed-signal circuits and systems. As transistors' feature size scales down in the deep submicron process, the short channel effects, high leakage current and reduced supply voltages make the design of op amps more challenging. In this dissertation, we present several methods to improve op amps' DC gain, slew rate, power efficiency and current utilization efficiency (CUE).

A basic requirement for an op amp is high DC gain, especially for high precision applications. We introduce a method to robustly improve op amps' DC gain with negligible power and area overhead. The new DC gain enhancement method can be implemented based on the source degeneration circuit (SDC) or the flipped voltage attenuator (FVA). Compared to the FVA-based technique, the SDC-based technique is more suitable for those CMOS processes whose transistors' threshold voltages are too low for the transistors in the FVA to work in weak or strong inversion regions. Otherwise, the FVA-based technique is recommended, as this technique is more robust to devices' random mismatch. A prototype op amp with the FVA-based technique is designed and fabricated in the IBM 130nm process. The measurement and simulation results of the prototype verify that the technique largely enhances an op amp's DC gain and is very robust over process, voltage and temperature variations.

Another important op amp requirement is high slew rate. In this regard, we introduce a method that greatly improves an op amp's slew rate while still preserving its small signal performance by a well-defined turn-on condition. The performance of the introduced method is discussed in comparison with an existing adaptive biasing method that has been widely used to enhance slew rate. The introduced method excels in several aspects. First, unlike the adaptive biasing method, which degrades an op amp's linearity, the introduced method is able to enhance linearity. Second, the proposed method improves an op amp's slew rate by 2320% (vs. 780% by the adaptive method) with a power and area overhead of 2% and 1.2% (vs. 15% and 35% by the adaptive method). In addition, the proposed method improves the op amp's total harmonic distortion (THD) by 6dB, whereas the adaptive method degrades the THD by 12dB.

The ability to drive large capacitive loads is becoming critical for op amps in emerging applications such as liquid crystal display drivers. In this regard, we introduce a power efficient design of op amps that can drive large capacitive loads. The proposed method decouples the large and small signal performance, eliminates current waste in the preamp stages' load circuits, and is not sensitive to devices' random mismatches. Compared to the state-of-the-art methods, our design prototype in a CMOS 180nm process shows largely improved small and large signal figures of merit, equivalent to largely improved power efficiency for given small and large signal performance specifications.

The folded cascode amplifier (FCA) is a commonly used architecture for designing op amps, but a significant portion of its supply current is wasted in the cascode stage. This not only reduces the current utilization efficiency (CUE), defined as the ratio of an FCA's tail current to its total supply current, but also degrades the FCA's gain, noise and offset. In this regard, we introduce a method to dramatically reduce an FCA's cascode stage current without degrading the FCA's settling performance. Compared to the existing methods, the proposed method effectively improves not only the CUE but also the settling performance of op amps.

Lastly, a prototype FCA, with the proposed performance enhancement techniques of gain, slew rate and CUE, is designed to demonstrate the compatibility of these techniques.



## CHAPTER 1. INTRODUCTION

#### **1.1.Background**

Operational amplifiers (op amps) are one of the most fundamental and widely-used building blocks for analog and mixed-signal circuits and systems. Their applications are found in low to high speed systems with large to small capacitive loads such as filters, data converters, integrators, power management IC and communication transmitters and receivers [1-5].

The design of analog circuits such as op amps is becoming more challenging in submicron CMOS processes, mainly due to short channel effects, high leakage current and reduced supply voltage. Short channel effects reduce a transistor's intrinsic gain, which makes it harder to design an effective high gain op amp. High leakage current imposes an upper limit on the achievable impedance at a node, which consequently limits the DC gain of an op amp. A low supply voltage limits the maximum achievable signal to noise ratio (SNR) and slew rate. In addition to the design difficulties caused by submicron CMOS processes, achieving desired specifications such as low noise, large gain-bandwidth product (GBW), fast slew rate (SR), short settling time, large capacitive load driving capability and wide input common mode range demands large power and area consumption for many op amps.

This dissertation is concerned with op amp performance enhancement for DC gain, slew rate, power efficiency and current utilization efficiency (CUE). Several new techniques to improve an op amp's DC gain, slew rate, power efficiency and CUE are introduced and discussed in this dissertation.

#### **1.2.Dissertation Outline**

In Chapter 2, high precision applications demanding high DC gain operational amplifiers are presented first, followed by a literature review of state-of-the-art DC gain enhancement (GE) techniques. Then the principles for robust GE techniques via conductance cancellation are introduced. After that, two proposed GE techniques, designed based on a source degeneration circuit (SDC) and a flipped voltage attenuator (FVA) respectively, are introduced, analyzed and compared. Finally, incorporating the FVA-based GE technique, several design examples are presented and discussed with the simulation and measurement results to confirm the robustness and effectiveness of the proposed GE method.

In Chapter 3, operational transconductance amplifier (OTA) applications whose settling time is restricted by the op amps' slew rate are described. The literature on slew rate enhancement (SRE) for OTAs is reviewed. Then a new SRE circuit is introduced, which preserves the OTA's small signal performance, possesses a well-defined turn-on voltage, dramatically improves the OTA's slew rate and slightly improves the OTA's linearity. Next, the introduced SRE circuit is analyzed in both small and large signal operation. A design example incorporating the SRE technique is presented with simulation results to confirm the effectiveness of the proposed SRE method.

In Chapter 4, op amp applications that drive capacitive loads in the range of nF to uF are reviewed first, followed by a literature review of state-of-the-art op amp designs for these applications. Then, the desired features and conceptual design of the proposed power-efficient op amps for these applications are introduced. After that, a design example incorporating the proposed power-efficient design method is introduced and analyzed in detail. Comprehensive simulation results of the design example are presented at the end of the chapter. The results verify that the proposed design has better small and large signal figures of merit than the state-of-the-art methods.




In Chapter 5, op amp applications that need wide or close-to-supply-rail input common mode range, low noise, low offset voltage, low power consumption and high gain are reviewed. For these applications, folded cascode amplifiers (FCAs) are a natural choice of structure. Two differential-to-single-ended conversion circuits for a conventional FCA are then discussed and compared in terms of speed, followed by a literature review on the design of the FCA's cascode stage, or turn-around stage. Following that, a new turn-around stage for an FCA is introduced to dramatically reduce the current waste in the FCA so as to improve the FCA's current utilization efficiency (CUE). In the end, a design example is presented with detailed analysis and extensive simulation results to verify the CUE improvement and confirm that the proposed CUE enhancement technique does not introduce a long recovery time.

In Chapter 6, an op amp that integrates the GE, SRE and CUE enhancement techniques introduced in Chapters 2, 3 and 5 is designed. The simulation results of the design example are presented and discussed to confirm the compatibility of these proposed performance enhancement techniques.

## **1.3.References**

- [1]. S. Koziel, R. Schaumann and H. Xiao, "Analysis and optimization of noise in continuous-time OTA-C filters," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 52, no. 6, pp. 1086-1094, June 2005
- [2]. H. Ishii, K. Tanabe and T. Iida, "A 1.0 V 40mW 10b 100MS/s pipeline ADC in 90nm CMOS," *Proceedings of the IEEE 2005 Custom Integrated Circuits Conference*, San Jose, CA, 2005, pp. 395-398.
- [3]. J. Silva, U. Moon, J. Steensgaard and G. C. Temes, "Wideband low-distortion delta-sigma ADC topology," *Electronics Letters*, vol. 37, no. 12, pp. 1-2, 2001



- [4]. Cheung Fai Lee and P. K. T. Mok, "A monolithic current-mode CMOS DC-DC converter with on-chip current-sensing technique," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 1, pp. 3-14, Jan. 2004
- [5]. A. A. Abidi, "Direct-conversion radio transceivers for digital communications," *IEEE Journal of Solid-State Circuits*, vol. 30, no. 12, pp. 1399-1410, Dec. 1995



## CHAPTER 2. GAIN ENHANCEMENT FOR OPERATIONAL AMPLIFIERS

#### **2.1.Introduction**

Operational amplifiers (op amps) are important fundamental analog building blocks for many analog and mixed signal systems. Realizing high gain op amps in standard digital CMOS is key to implementing a high precision system on a chip. However, as transistor feature sizes continuously scale down and supply voltages reduce, a transistor's intrinsic gain becomes smaller, typically in the range of 20-30dB in a deep submicron process. Cascoding two transistors in a stack can boost the DC gain to 40~60dB, but this is still far from sufficient for high precision applications such as sigma-delta converters, switched capacitor circuits and optical sensor analog front ends to perform at their best. An efficient DC gain enhancement (GE) method is needed.

#### **2.2.Literature Review**

#### 2.2.1 General approaches for op amp DC gain enhancement

In an effort to improve the DC gain of op amps in submicron processes, four methods, shown in Figure 2.1, have been reported in the literature [1-4]: a) cascoding multiple transistors in a stack; b) cascading multiple gain stages; c) gain-boosting [1]; and d) conductance cancellation [2-4]. Method a) is simple but results in a loss of voltage headroom, especially when there are many transistors in a stack. Method b) requires complex frequency compensation, which seriously degrades an amplifier's frequency response and settling performance. Method c) usually introduces pole-zero doublets, which harm an amplifier's settling performance, in particular when high-accuracy settling is required. Method d) is so far seldom used in commercial production because, with existing schemes such as [2-4], the yield of large DC gain enhancement over PVT (process, voltage and temperature) variation and output voltage swing is low. The main difficulty is that the negative conductance generated in [2-4] does not track and cancel its positive counterpart when the op amp's PVT condition and output voltage change. As a result, without extensive tuning, the methods in [2-4] can only provide large DC gain enhancement at a particular PVT condition and output voltage. Whenever the operating conditions or temperature of the op amps in [2-4] change, the op amps need to be calibrated again; otherwise the methods would fail to provide large DC gain enhancement and could even reduce the DC gain. Consequently, these op amps always need to be calibrated before normal operation, which makes them unsuitable for continuous-time operation. To maintain functionality, off-chip high-gain low-offset comparators are used in [2-3] as manual tuning circuits, and a micro-controller and a 16-bit DAC are used in [4] as automatic tuning circuits. However, because these complicated tuning circuits must operate extensively, the cost, power consumption and area overhead of [2-4] are high.



Figure 2.1 Gain enhancement via (a) cascoding transistors (b) cascading gain stages (c) regulated gain boost (d) conductance cancellation






#### 2.2.2 A state-of-the-art gain enhancement method via gds cancellation

A half circuit of Yan's method [3] is shown in Figure 2.2. Because the drain current of M5 is fixed and the bulk and source of M5 are connected, the gate voltages of M5 and M6 track the changes in the source voltages of M5 and M6. Consequently, the equivalent conductance looking down from the source of M6 is (1-A)\*gds6, which becomes negative if A is larger than 1. The DC gain of the amplifier in Figure 2.2 can be derived as (2-1), assuming that the input impedance of gain block A is high.

$$A_{o} = -\frac{g_{m1}}{g_{ds1} + g_{ds2} + g_{ds4} + g_{ds8} + (1 - A)g_{ds6}}$$
(2-1)

As can be seen from (2-1), in order to achieve large DC gain enhancement, (1-A)\*gds6 needs to always approach and cancel  $g_{ds1} + g_{ds2} + g_{ds4} + g_{ds8}$ , which is a very challenging task for two reasons. First, the conductance variations of PMOS and NMOS transistors differ across process corners. For example, the conductances of PMOS and NMOS transistors change in opposite directions when the process corner changes from TT (typical corner) to SF (NMOS slow, PMOS fast) or from TT to FS (NMOS fast, PMOS slow). Second, gds6, gds2 and gds4 vary in a direction different from that of gds1 and gds8 when the output voltage changes. For instance, as the output voltage increases, the drain-source voltages of M1 and M8 increase while those of M2, M4 and M6 decrease, assuming that A is larger than 1. As a result, gds1 and gds8 decrease while gds6, gds2 and gds4 increase; moreover, the increase in gds6 differs from that in gds2 and gds4 because of the amplification A. Therefore, the negative conductance generated by this method cannot track the changes of the positive conductance under PVT variation and output voltage swing.
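The sensitivity described above can be illustrated numerically. The following is a minimal sketch that evaluates (2-1) for hypothetical conductance values; all numbers, and the function name `dc_gain_2_1`, are assumptions chosen only for illustration and are not taken from [3]. It sweeps the block gain A around the value that gives complete cancellation.

```python
import math


def dc_gain_2_1(gm1, gds1, gds2, gds4, gds8, gds6, A):
    """DC gain of the half circuit per (2-1); A is the gain of the block driving M6."""
    g_total = gds1 + gds2 + gds4 + gds8 + (1.0 - A) * gds6
    return -gm1 / g_total


# Hypothetical small signal values in siemens (assumptions for illustration only).
gm1 = 1e-3
gds = dict(gds1=10e-6, gds2=10e-6, gds4=10e-6, gds8=10e-6, gds6=20e-6)

# With these values A = 3 would cancel the 40 uS positive conductance exactly,
# so the sweep stays close to, but not at, that point.
for A in (1.0, 2.8, 2.9, 2.98, 3.1):
    a0 = dc_gain_2_1(gm1, A=A, **gds)
    net = "negative" if a0 > 0 else "positive"
    print(f"A = {A:4.2f}: |Ao| = {20 * math.log10(abs(a0)):5.1f} dB, net conductance {net}")
```

Under these assumed values, a change of a few percent in A moves the gain by tens of dB and can flip the sign of the net conductance, which is why a cancellation term that does not track the positive conductance needs repeated tuning.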



Figure 2.2: Yan's conductance cancellation method

## 2.3. Principles of Robust Gain Enhancement via Gds Cancellation

In order to robustly enhance amplifiers' DC gain via the conductance cancellation method, the gds cancellation should satisfy the following three requirements.

- 1) Type matching: only NMOS can be used for NMOS gds cancellation and only PMOS can be used for PMOS gds cancellation.
- 2) Operation matching: the critical transistors in the gds cancellation circuits should have the same bias and operating conditions. This means they should share the same gate, source, drain and bulk voltages and current densities.
- 3) Layout matching: the critical transistors may have different multipliers but should have the same width and length. Those transistors should have a common centroid layout so as to share the same temperature variation.

## 2.4. Concept of Proposed Gain Enhancement Method via Gds Cancellation

The concept of the proposed gain enhancement (GE) method is illustrated in Figure 2.3, in which the gds of the bottom NMOS transistor in an op amp's cascode stage is cancelled. As shown in Figure 2.3, the sensing and control block first senses signals Vs+ and Vs- from the cascode stage, where Vs+ and Vs- are functions of the bottom NMOS's gds. With the sensed signals, the block then generates  $V_{fb}$  to adjust the negative conductance, -gn, so that -gn becomes a function of the bottom NMOS's gds. This dependency makes -gn inherently track and cancel the bottom NMOS's gds over PVT variations. Similarly, the gds of the top PMOS transistor in the cascode stage can be robustly cancelled by implementing the PMOS counterpart of the sensing and control block. Because the NMOS and PMOS sensing and control blocks are independent, the output impedance of the cascode stage can be increased independently on each side. When the gds of both the top PMOS and the bottom NMOS of the cascode stage are completely cancelled by the proposed method, the op amp's DC gain is ideally infinite.

In regard to the implementations of the sensing and control block, two design approaches will be introduced in the following chapter sections. The first design approach is based on a flipped voltage attenuator (FVA). The second approach is based on a source degeneration circuit (SDC).





Figure 2.3: Concept of the proposed gain enhancement method via gds cancellation

#### **2.5.A SDC-based Gain Enhancement Technique**

Figure 2.4 shows a SDC-based gain enhancement circuit for an op amp. In Figure 2.4a), transistors M3 and M6 are respectively bottom and cascode NMOS transistors in the op amp's cascode stage. The level shifter  $A_{N1}$ , whose implementation is shown in Figure 2.4c), senses transistor M3's drain voltage ( $V_A$ ) and then shifts up  $V_A$  to voltage  $V_B$ . The voltage  $V_B$  is connected to the input of an SDC formed by transistors M4-M9. In the SDC, the current mirror M7-M8 has mirror ratio of 1:1. Transistors M3, M4 and M9 have the same unit size (same width and length) with different multipliers: m, 1, and 1, respectively. As M3, M4 and M9 are the same type of transistors and have the same current density,  $V_{b4}$ ,  $V_E$  and  $V_F$  will be equal in the DC operation. The gain block of -1 can be easily implemented in fully differential circuits.

The small signal circuit of the SDC is displayed in Figure 2.4 b). After deriving KCL equations (2-2), (2-3) and (2-4) at nodes  $V_C$ ,  $V_D$  and  $V_E$  respectively, the DC gain from  $V_B$  to  $V_C$  is calculated as (2-5), where  $\varepsilon_1$  is given in (2-6) and  $\eta_5$  is  $g_{mb5}/g_{m5} \cong 0.15$ .  $\varepsilon_1$  is approximately equal to  $\frac{g_{ds4}+g_{ds5}}{g_{m5}} \approx \frac{g_{ds5}}{g_{m5}}$  because the cascode transistor M5 is usually sized with a much shorter length than M4. The magnitude of  $\varepsilon_1$  is in the order of 0.05 when a transistor's intrinsic gain is about 20.





Figure 2.4: SDC-based gds cancellation a) negative gds generator b) small signal circuit of the circuit in (a) c) low gain amplifier A<sub>N</sub>

$$g_{ds4}V_{C} - g_{m5}(V_{B} - V_{C}) + g_{ds5}(V_{C} - V_{D}) + g_{mb5}V_{C} = 0$$
(2-2)

$$V_{\rm D}(g_{\rm m7} + g_{\rm ds7}) + g_{\rm ds4}V_{\rm C} = 0$$
(2-3)

$$g_{m8}V_D + g_{ds8}V_E + g_{m9}V_E + g_{ds9}V_E = 0$$
(2-4)

$$\frac{V_{C}}{V_{B}} = \frac{g_{m5}(g_{m7} + g_{ds7})}{g_{ds4}g_{ds5} + (g_{m7} + g_{ds7})(g_{ds4} + g_{ds5} + g_{m5}(1 + \eta_{5}))} = \frac{1}{(1 + \eta_{5})(1 + \epsilon_{1})}$$
(2-5)

$$\varepsilon_{1} = \frac{g_{ds5}(g_{ds7} + g_{m7}) + g_{ds4}(g_{ds5} + g_{ds7} + g_{m7})}{g_{m5}(g_{ds7} + g_{m7})} \cong 0.1$$
(2-6)

$$A_{N1} = \frac{(g_{ds10} + g_{m10} + g_{mb10})(g_{ds11} + g_{ds12} + g_{m11})}{g_{ds11}g_{ds12} + (g_{ds10} + g_{m10})(g_{ds11} + g_{ds12} + g_{m11})} = (1 + \eta_{10})(1 + \varepsilon_2)$$
(2-7)

$$\varepsilon_2 = -\frac{g_{ds11}g_{ds12}g_{m10} + a}{b(g_{m10} + g_{mb10})} \cong -0.01$$
(2-8)

$$a = (g_{ds11}g_{ds12} + g_{ds10}(g_{ds11} + g_{ds12} + g_{m11}))g_{mb10}$$
  

$$b = g_{ds11}g_{ds12} + (g_{ds10} + g_{m10})(g_{ds11} + g_{ds12} + g_{m11})$$
(2-9)

$$\frac{V_{C}}{V_{A}} = \frac{V_{C}A_{N1}}{V_{B}} = \frac{(1+\eta_{10})(1+\epsilon_{2})}{(1+\eta_{5})(1+\epsilon_{1})} = \frac{1}{1+\epsilon_{3}} \cong 0.94$$
(2-10)



$$\frac{V_F}{V_A} = -\frac{(1+\eta_{10})(1+\epsilon_2)g_{ds4}g_{m5}g_{m8}}{(g_{ds8}+g_{ds9}+g_{m9})k}$$
(2-11)

$$k = g_{ds4}g_{ds5} + (g_{ds7} + g_{m7})(g_{ds5} + g_{ds4} + g_{m5} + g_{mb5})$$
(2-12)

$$\frac{V_{\rm F}}{V_{\rm A}} = \frac{-(1+\eta_{10})(1+\epsilon_2)\,g_{\rm ds4}g_{\rm m8}}{(1+\eta_5)(1+\epsilon_4)g_{\rm m7}g_{\rm m9}} = \frac{-(1+\epsilon_5)\,g_{\rm ds4}g_{\rm m8}}{g_{\rm m7}g_{\rm m9}}$$
(2-13)

$$\varepsilon_{4} = \frac{k(g_{ds8} + g_{ds9} + g_{m9})}{(g_{m5} + g_{mb5})g_{m7}g_{m9}} - 1 \approx 0.1$$
(2-14)

$$\varepsilon_{5} = \frac{(1 + \eta_{10})(1 + \varepsilon_{2})}{(1 + \eta_{5})(1 + \varepsilon_{4})} - 1 \approx -0.1$$
(2-15)

$$g_{N1} = -\frac{V_F}{V_A} g_{m3} = \frac{(1+\epsilon_5) g_{m8} g_{ds4} g_{m3}}{g_{m7} g_{m9}} = (1+\epsilon_5) \frac{g_{m8}}{g_{m7}} \frac{g_{ds4} g_{m3}}{g_{ds3} g_{m4}} \frac{g_{m4}}{g_{m9}} g_{ds3} = (1+\epsilon_5)(1+\epsilon_6)(1+\epsilon_7)(1+\epsilon_8)g_{ds3}$$
(2-16)

$$\epsilon_{6} = \frac{g_{m8}}{g_{m7}} - 1; \epsilon_{7} = \frac{g_{ds4}g_{m3}}{g_{ds3}g_{m4}} - 1; \epsilon_{8} = \frac{g_{m4}}{g_{m9}} - 1$$
(2-17)

$$g_{A} = g_{ds3} - g_{N1} + \frac{g_{ds12}g_{ds11}(1+\eta_{10})}{g_{m11} + g_{ds11} + g_{ds12}} \approx (\varepsilon_{5} + \varepsilon_{6} + \varepsilon_{7} + \varepsilon_{8}) g_{ds3}$$
(2-18)

$$F_1 = \frac{g_A}{g_{ds3}} \approx \varepsilon_5 + \varepsilon_6 + \varepsilon_7 + \varepsilon_8$$
(2-19)

$$\sigma_{F1}^2 \approx \sigma_{\varepsilon_5}^2 + \sigma_{\varepsilon_6}^2 + \sigma_{\varepsilon_7}^2 + \sigma_{\varepsilon_8}^2$$
(2-20)

In addition, the DC gain of the level shifter,  $A_{N1}$ , is found as (2-7), where  $\eta_{10}$  is  $g_{mb10}/g_{m10}$ .  $\varepsilon_2$  is shown in (2-8) and is in the order of -0.01, where a and b are expressed as (2-9). Therefore, the DC gain from  $V_A$  to  $V_C$  is found as (2-10), where  $\varepsilon_3$  is approximately in the same order as  $\varepsilon_1$ , that is 0.05. As can be seen from (2-10), the voltage  $V_C$  is approximately a replica of  $V_A$ . As M3 and M4 are transistors of the same type with the same width, length, gate voltage, source voltage, drain voltage and current density,  $g_{ds3}/g_{m3}$  and  $g_{ds4}/g_{m4}$  should have the same variation over process, supply voltage and electrical operating point if mismatch effects are not considered. A common centroid layout for M3 and M4 is recommended to maximize the matching between  $g_{ds3}/g_{m3}$  and  $g_{ds4}/g_{m4}$  over temperature variations. Once the gain from  $V_A$  to  $V_C$  is calculated, the DC gain from  $V_A$  to  $V_F$  is derived as (2-11), where k is given in (2-12). With further simplification, (2-11) becomes (2-13), where  $\varepsilon_4$  and  $\varepsilon_5$  are given in (2-14) and (2-15) respectively. The magnitudes of  $\varepsilon_4$  and  $\varepsilon_5$  are in the order of 0.1 and -0.1. Thus, the generated negative conductance,  $-g_{N1}$ , is derived as (2-16), where  $\varepsilon_6$ ,  $\varepsilon_7$  and  $\varepsilon_8$ , as shown in (2-17), respectively represent the mismatch between M7 and M8, the mismatch between M3 and M4, and the mismatch between M4 and M9. After combining the positive and generated negative conductance, equation (2-18) gives the net conductance,  $g_A$ , looking down from the source of M6. As can be seen from (2-18), the generated negative conductance is an intrinsic function of the positive conductance,  $g_{ds3}$ , which makes the proposed DC gain enhancement technique effective and robust over PVT variations.

The term  $\frac{g_{ds12}g_{ds11}(1+\eta_{10})}{g_{m11}+g_{ds11}+g_{ds12}}$  in (2-18) can easily be designed to be at least 100 times smaller than  $g_{ds3}$  for two reasons. First, the drain current of M12 should be designed to be much smaller (about 10 times) than that of M3, so that  $g_{ds12}$  can be 10 times smaller than  $g_{ds3}$ . Second, the intrinsic gain of M11 can be designed in the neighborhood of 20. Thus, the variation of this term is negligible compared with the change in  $g_{ds3}$  and  $g_{N1}$  when mismatch and PVT variations are present. The expected conductance reduction factor,  $F_1$ , of this conductance cancellation method can be derived as (2-19). As shown in (2-19),  $F_1$  is highly related to the matching between the critical transistor pairs in the current mirrors, such as M7 and M8, M4 and M9, and M3 and M4. The variance of  $F_1$  can be roughly calculated as (2-20), which may not be rigorously correct since  $\varepsilon_5$ ,  $\varepsilon_6$ ,  $\varepsilon_7$  and  $\varepsilon_8$  are not pairwise independent. Nevertheless, (2-20) still offers insight for designing this type of negative impedance generator with source degeneration circuits.
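The claim that the level shifter term in (2-18) is at least 100 times smaller than $g_{ds3}$ can be checked with simple arithmetic. The sketch below assumes hypothetical values: $g_{ds12}$ one tenth of $g_{ds3}$, an intrinsic gain of 20 for M11, $g_{ds11}$ equal to $g_{ds12}$, and $\eta_{10}$ equal to the 0.15 quoted for $\eta_5$. None of these numbers, nor the function name, come from the design itself.

```python
def level_shifter_loading(gds11, gds12, gm11, eta10):
    """Residual conductance term of (2-18) contributed by the level shifter."""
    return gds12 * gds11 * (1.0 + eta10) / (gm11 + gds11 + gds12)


# Hypothetical values in siemens (assumptions for illustration only).
gds3 = 10e-6          # output conductance of the bottom NMOS M3
gds12 = gds3 / 10.0   # M12 carries about one tenth of M3's current
gds11 = gds12         # assumed comparable to gds12
gm11 = 20.0 * gds11   # intrinsic gain of M11 set to 20
eta10 = 0.15          # assumed body-effect ratio gmb10/gm10

term = level_shifter_loading(gds11, gds12, gm11, eta10)
print(f"loading term = {term:.3e} S, gds3 / term = {gds3 / term:.0f}")
```

With these assumed values the ratio comes out well above 100, consistent with treating the term as negligible in (2-18).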



#### **2.6.A FVA-based Gain Enhancement Technique**

Figure 2.5a) shows a flipped voltage attenuator (FVA)-based gain enhancement circuit via gds cancellation. Transistors M3 and M2 are respectively the bottom NMOS and cascode NMOS transistors in a cascode stack. The drain voltage of M3 is sensed by the low gain amplifier  $A_{N2}$ , the implementation of which is shown in Figure 2.5c). The output of amplifier  $A_{N2}$  is connected to the FVA, formed by M4~M7.



Figure 2.5: FVA-based gds cancellation a) negative gds generator b) small signal circuit of the circuit in (a) c) low gain amplifier  $A_{N2}$ 

The small signal circuit of the flipped voltage attenuator is displayed in Figure 2.5b). By writing KCL equations at nodes  $V_F$  and  $V_C$ , as shown in (2-21), the DC gain from  $V_B$  to  $V_C$  is derived as (2-22), where  $g_x$  is  $g_{ds6}g_{ds7}/(g_{ds6} + g_{ds7} + g_{m6})$  and  $\gamma_1$  is given by (2-23).  $\gamma_1$  is about 0.04 in this process. The DC gain of the low gain amplifier,  $A_{N2}$ , is shown in (2-24), where  $\gamma_2$ , expressed in (2-25) with c and d given in (2-26), is about -0.01. Knowing  $A_{N2}$  ( $V_B/V_A$ ) and  $V_C/V_B$ , the DC gain from  $V_A$  to  $V_C$  is found to be close to 1. This means that the voltage variation at the drain of M3 is approximately the same as that at the drain of M4. Therefore, when M3 and M4 are placed in a common centroid layout, the variations of gds3/gm3 and gds4/gm4 should track each other over PVT and output voltage swing variations, because M3 and M4 are the same type of transistor with identical width, length, gate voltage, source voltage, drain voltage and current density.

$$g_{m5}(V_B - V_C) - g_{mb5}V_C + g_{ds5}(V_F - V_C) + g_{x}V_F = 0$$

$$g_{x}V_F + g_{ds4}V_C + g_{m4}V_F = 0$$
(2-21)

$$\frac{V_{C}}{V_{B}} = \frac{g_{m5}(g_{m4} + g_{x})}{g_{ds4}(g_{ds5} + g_{x}) + (g_{m4} + g_{x})(g_{ds5} + g_{m5} + g_{mb5})} = \frac{1}{(1 + \eta_{5})(1 + \gamma_{1})}$$
(2-22)

$$\gamma_1 = \frac{g_{ds4}(g_{ds5} + g_x) + g_{ds5}(g_{m4} + g_x)}{(g_x + g_{m4})(g_{m5} + g_{mb5})} \cong 1/[(1 + \eta_5)A_{v5}] = 0.04$$
(2-23)

$$A_{N2} = \frac{(g_{ds8} + g_{m8} + g_{mb8})(g_{ds9} + g_{ds10} + g_{m9})}{g_{ds9}g_{ds10} + (g_{ds8} + g_{m8})(g_{ds9} + g_{ds10} + g_{m9})} = (1 + \eta_8)(1 + \gamma_2)$$
(2-24)

$$\gamma_{2} = -\frac{g_{ds9}g_{ds10}g_{m8} + c}{d(g_{m8} + g_{mb8})} \cong -\frac{\eta_{8}}{(1 + \eta_{8})A_{v8}} \cong -0.007$$
(2-25)

$$c = [g_{ds9}g_{ds10} + g_{ds8}(g_{ds9} + g_{ds10} + g_{m9})]g_{mb8}$$

$$d = g_{ds9}g_{ds10} + (g_{ds8} + g_{m8})(g_{ds9} + g_{ds10} + g_{m9})$$
(2-26)
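The algebra linking the KCL equations to the closed forms (2-22) and (2-23) can be cross-checked numerically. The sketch below solves the two node equations directly, writing the current-source branch conductance as $g_x$ as in (2-22), and compares the result with the closed-form expressions. The small signal values and the function names are assumptions chosen only for illustration.

```python
def fva_gain_kcl(gm4, gds4, gm5, gmb5, gds5, gx, vb=1.0):
    """Solve the two FVA node equations for V_C and V_F with V_B = vb.

    Node V_F: gm5*(VB - VC) - gmb5*VC + gds5*(VF - VC) + gx*VF = 0
    Node V_C: gx*VF + gds4*VC + gm4*VF = 0
    """
    # The second equation gives VF = k * VC.
    k = -gds4 / (gm4 + gx)
    # Substituting into the first equation leaves one unknown, VC.
    vc = gm5 * vb / (gm5 + gmb5 + gds5 - (gds5 + gx) * k)
    return vc, k * vc


def fva_gain_closed_form(gm4, gds4, gm5, gmb5, gds5, gx):
    """V_C / V_B as written in (2-22)."""
    num = gm5 * (gm4 + gx)
    den = gds4 * (gds5 + gx) + (gm4 + gx) * (gds5 + gm5 + gmb5)
    return num / den


def gamma1(gm4, gds4, gm5, gmb5, gds5, gx):
    """Error term of (2-23)."""
    return (gds4 * (gds5 + gx) + gds5 * (gm4 + gx)) / ((gx + gm4) * (gm5 + gmb5))


# Hypothetical small signal values in siemens (assumptions for illustration only).
p = dict(gm4=1e-3, gds4=50e-6, gm5=1e-3, gmb5=0.15e-3, gds5=50e-6, gx=1e-6)

vc, vf = fva_gain_kcl(**p)
eta5 = p["gmb5"] / p["gm5"]
print(f"KCL solve      : VC/VB = {vc:.6f}")
print(f"closed form    : VC/VB = {fva_gain_closed_form(**p):.6f}")
print(f"via eta5,gamma1: VC/VB = {1.0 / ((1.0 + eta5) * (1.0 + gamma1(**p))):.6f}")
```

The three printed values agree to numerical precision for any positive conductances, which indicates that (2-22) and (2-23) follow from the node equations when the branch conductance is taken as $g_x$.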

The DC gain from V<sub>A</sub> to V<sub>F</sub> is calculated as (2-27), where h,  $\gamma_3$  and  $\gamma_4$  are given in (2-28), (2-29) and (2-30). The values of  $\gamma_3$  and  $\gamma_4$  are respectively close to 0.04 and -0.05. Using the expression of V<sub>F</sub>/V<sub>A</sub>, the generated negative conductance,  $-g_{N2}$ , can be easily derived as (2-31), where  $\gamma_5$ , given in (2-32), represents the transistor intrinsic gain mismatch between M3 and M4. After combining  $-g_{N2}$  and the positive conductance, the net conductance looking down from the source of M2,  $g_A$ , is derived as (2-33). The equation can be simplified to  $g_{ds3}(\gamma_4 + \gamma_5)$  because  $\frac{g_{ds10}g_{ds9}(1+\eta_8)}{g_{m9}+g_{ds9}+g_{ds10}}$  is typically more than 100 times smaller than  $g_{ds3}$ . As  $\gamma_4$  and  $\gamma_5$  are much smaller than 1,  $g_A$  is much smaller than the original conductance. The conductance reduction factor brought by the technique is derived as (2-34). As can be seen, F<sub>2</sub> is highly related to the matching between M3 and M4. The variance of F<sub>2</sub> can be roughly derived as (2-35), which may not be rigorously correct since  $\gamma_4$  and  $\gamma_5$  are not independent. However, this still provides design insights for the FVA-based gain enhancement technique.

Though the analysis above provides design insights (for example, improving the matching between M3 and M4 is meaningful) for achieving the best gain enhancement, a simulation should be conducted to examine the gain enhancement. If  $g_A$  is systematically positive in simulation, one can increase M8 size or decrease M5 size slightly to reduce the drain voltage of M4, which raises  $g_{ds4}/g_{m4}$  and  $\gamma_5$ . If  $g_A$  is systematically negative, one should reduce M8 size or increase M5 size slightly. A design example with this technique will be discussed at length in Section 2.8.

$$\frac{V_{F}}{V_{A}} = -\frac{g_{ds4}g_{m5}A_{N2}}{h + g_{ds4}(g_{ds5} + g_{x})} = -\frac{g_{ds4}(1 + \eta_{8})(1 + \gamma_{2})}{g_{m4}(1 + \eta_{5})(1 + \gamma_{3})} = -\frac{g_{ds4}(1 + \gamma_{4})}{g_{m4}}$$
(2-27)

$$h = (g_x + g_{m4})(g_{ds5} + g_{m5} + g_{mb5})$$
(2-28)

$$\gamma_3 = \frac{g_{ds4}g_{ds5} + g_{ds5}g_{m4} + g_x(g_{ds4} + g_{ds5} + g_{m5} + g_{mb5})}{g_{m4}(g_{m5} + g_{mb5})} \cong 0.04$$
(2-29)

$$\gamma_4 = \frac{(1+\eta_8)(1+\gamma_2)}{(1+\eta_5)(1+\gamma_3)} - 1 \cong \frac{\eta_8 - \eta_5 - \eta_8/A_{v8} - 1/A_{v5}}{1+\eta_5 + 1/A_{v5}} = -0.05$$
(2-30)

$$g_{N2} = \frac{V_F g_{m3}}{-V_A} = \frac{g_{m3} g_{ds4} (1 + \gamma_4)}{g_{m4}} = (1 + \gamma_4) (1 + \gamma_5) g_{ds3}$$
(2-31)

$$\gamma_5 = g_{m3}g_{ds4} / (g_{m4}g_{ds3}) - 1 \tag{2-32}$$

$$g_{A} = g_{ds3} - g_{N2} + \frac{g_{ds10}g_{ds9}(1+\eta_{8})}{g_{m9} + g_{ds9} + g_{ds10}} \approx g_{ds3}(\gamma_{4} + \gamma_{5})$$
(2-33)

$$F_2 = g_A/g_{ds3} \approx \gamma_4 + \gamma_5 \tag{2-34}$$

$$\sigma_{F_2}^2 \approx \sigma_{\gamma_4}^2 + \sigma_{\gamma_5}^2 \tag{2-35}$$



#### 2.7. The SDC-based vs. the FVA-based Gain Enhancement Technique

The discussions in Sections 2.5 and 2.6 together show that both the SDC-based and FVA-based gain enhancement techniques via conductance cancellation obey the rules of robust gain enhancement via conductance cancellation. In this section, the two techniques are compared.

Compared to the FVA-based technique, the SDC-based technique is more suitable for CMOS processes in which the transistors' threshold voltages are too low for the transistors to work in the weak or strong inversion regions with the FVA configuration. Otherwise, the FVA-based technique is recommended due to the following advantages. First, the FVA-based technique is simpler, more compact and more power efficient because it involves fewer transistors and circuit branches. Second, the FVA-based technique has fewer high frequency poles in the gain enhancement signal path. Third, the FVA-based technique is suitable for both fully differential and single ended op amps, whereas the SDC-based technique needs an additional gain block of -1 for a single ended op amp. Last but not least, the FVA-based technique is more robust to random device mismatch, simply because the variance of its gds reduction factor, shown in (2-35), is smaller than that of the SDC-based technique in (2-20).
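The variance argument can be made concrete with a rough sketch. It assumes, purely for illustration, that every error term in (2-20) and (2-35) is independent and has the same standard deviation; as noted earlier, the terms are not strictly independent, so the result is only indicative. The value of `sigma` and the helper name are hypothetical.

```python
import math


def rss(sigmas):
    """Root-sum-square of independent contributions, the form used in (2-20) and (2-35)."""
    return math.sqrt(sum(s * s for s in sigmas))


sigma = 0.02  # assumed standard deviation of each error term (hypothetical)

sigma_f1 = rss([sigma] * 4)  # SDC: four terms (epsilon_5 to epsilon_8)
sigma_f2 = rss([sigma] * 2)  # FVA: two terms (gamma_4 and gamma_5)

print(f"sigma_F1 = {sigma_f1:.4f}, sigma_F2 = {sigma_f2:.4f}, "
      f"ratio = {sigma_f1 / sigma_f2:.3f}")
```

Under this equal-sigma assumption the SDC-based reduction factor spreads about 1.4 times more widely than the FVA-based one, simply because twice as many terms contribute.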

In the prototype op amps designed in the following sections in an IBM 130nm CMOS process, the FVA-based gain enhancement technique is adopted for its design simplicity and low power and area consumption.



## 2.8.A Current Mirror Input Op Amp with the FVA-based GE Technique [7]


#### 2.8.1 Operating Principles

A current mirror input op amp with the proposed FVA-based gain enhancement technique is shown in Figure 2.6. The op amp core consists of a current mirror input stage, a cascode stage, and a push-pull output stage. By reusing the wide swing cascode current mirrors in the op amp core, only six transistors (M7~M9 and M22~M24) are needed for implementing the proposed gds cancellation circuits for both NMOS and PMOS sides. Similar to the previous DC analysis of the FVA-based technique, the equivalent conductance looked down from the source of M13 and looked up from the source of M15 can be found as (2-36). The expressions of  $\delta_1$ ,  $\delta_2$ ,  $\delta_3$ , and  $\delta_4$  are given in (2-37) to (2-40).

$$g_A \approx g_{ds11}(\delta_1 + \delta_2); \ g_B \approx g_{ds17}(\delta_3 + \delta_4)$$
(2-36)

$$\delta_1 \cong \frac{\eta_7 - \eta_3 - \eta_7 / A_{v7} - \frac{1}{A_{v3}}}{1 + \eta_3 + \frac{1}{A_{v3}}} \cong -0.05$$
(2-37)

$$\delta_2 = (g_{m11}g_{ds5}) / (g_{m5}g_{ds11}) - 1 = (A_{v11} - A_{v5}) / A_{v5}$$
(2-38)

$$\delta_3 \simeq \frac{\eta_{22} - \eta_{14} - \eta_{22} / A_{v22} - \frac{1}{A_{v14}}}{1 + \eta_{14} + 1 / A_{v14}} \simeq -0.08$$
(2-39)

$$\delta_4 = (g_{m17}g_{ds16}) / (g_{m16}g_{ds17}) - 1 = (A_{v17} - A_{v16}) / A_{v16}$$
(2-40)





Figure 2.6. Schematic of the designed op amps

#### 2.8.2 Sizing Strategies for DC Gain Boost

In terms of sizing, a good starting point is to set the sizes of the transistors (M7 and M22) in the NMOS and PMOS gds cancellation circuits as small copies of the corresponding transistors in the op amp core. For example, M7 and M22 have the same width and length as M3 and M14 but with fewer multipliers. M8~M9 and M23~M24 are just cascode current sources. With this starting point, the conductances  $g_A$  and  $g_B$  should be close to zero. The second step is to simulate  $g_A$  and  $g_B$  at the typical corner and check whether they are systematically positive or negative. If  $g_A$  is systematically above zero, one should increase the size of M7 or decrease the size of M3 slightly to reduce the drain voltage of M5 so as to raise  $g_{ds5}/g_{m5}$  and reduce  $g_A$ . Conversely, if  $g_A$  is systematically below zero, the opposite sizing strategy should be used. The same sizing strategy can be applied to find the optimal  $g_B$ . The third step is to determine the transistors' mismatch via Monte Carlo simulation. For example, according to (2-35), the mismatch between M5 and M11 and the mismatch between M16 and M17 should be within 10% to obtain a DC gain enhancement of 20dB. The last step is to simulate the robustness of the enhanced DC gain over PVT variations and to fine tune the transistor sizes accordingly.
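The link between the 10% matching target and the 20dB figure can be sketched as follows. The sketch assumes that the cascode stage's output conductance scales in proportion to the residual conductance of the cancelled transistor, so that a residual factor F yields a gain enhancement of 1/|F|; this proportionality and the function name are simplifying assumptions for illustration.

```python
import math


def gain_enhancement_db(f_residual):
    """DC gain enhancement in dB when the cancelled gds is scaled by |f_residual|."""
    return 20.0 * math.log10(1.0 / abs(f_residual))


# Residual factors are illustrative; 0.10 corresponds to the 10% target above.
for f in (0.30, 0.10, 0.05, 0.01):
    print(f"|F| = {f:4.2f} -> enhancement = {gain_enhancement_db(f):4.1f} dB")
```

Each further 20dB of enhancement requires the residual to shrink by another factor of ten, which is why the Monte Carlo step focuses on the matching of the critical transistor pairs.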

#### 2.8.3 Stability of an Op Amp with a RHP Dominant Pole

Under PVT variations, the negative conductance generated by the proposed method can be larger or smaller than the positive conductance to be cancelled. When the generated negative conductance is larger than the positive conductance, the output impedance of the first stage becomes negative and the dominant pole of the open loop op amp turns into a right half plane (RHP) pole. To understand a RHP pole's impact on an op amp's stability, a generic two-stage op amp placed in negative feedback, shown in Figure 2.7, is discussed. The open loop transfer function of the two-stage op amp is given as (2-41), where GBW, P<sub>1</sub> and P<sub>2</sub> are respectively the op amp's gain bandwidth product, dominant pole, and secondary non-dominant pole. The closed loop transfer function of the configuration in Figure 2.7, H(s), is calculated as (2-42), where  $\beta$  is the feedback factor. Because P<sub>2</sub> is larger than P<sub>1</sub>, the coefficient of s in the denominator of (2-42) is positive; therefore, as long as  $\beta$  is larger than the reciprocal of the op amp's DC gain (P<sub>1</sub>/GBW), all denominator coefficients are positive and the closed loop system in Figure 2.7 is stable.  $\beta$  is larger than P<sub>1</sub>/GBW in most practical applications. Therefore, a possible RHP dominant pole caused by overcompensation of the positive conductance should not affect an op amp's stability in most closed loop applications.

$$A(s) = (P_2GBW) / [(s - P_1)(s + P_2)]$$
(2-41)

$$H(s) = \frac{A(s)}{1 + A(s)\beta} = \frac{P_2 GBW}{s^2 + (P_2 - P_1)s + P_2(\beta GBW - P_1)}$$
(2-42)
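The stability condition read from (2-42) can be checked numerically. The following is a minimal sketch, not part of the original design flow: it solves the denominator of (2-42) for the closed loop poles and tests whether both lie in the left half plane. The numerical values of GBW, P<sub>1</sub>, and P<sub>2</sub> are hypothetical and chosen only for illustration.

```python
import cmath
import math

def closed_loop_poles(gbw, p1, p2, beta):
    """Roots of s^2 + (P2 - P1)*s + P2*(beta*GBW - P1), the denominator of (2-42)."""
    b = p2 - p1
    c = p2 * (beta * gbw - p1)
    disc = cmath.sqrt(b * b - 4.0 * c)
    return ((-b + disc) / 2.0, (-b - disc) / 2.0)

def is_stable(gbw, p1, p2, beta):
    """True when both closed loop poles have negative real parts."""
    return all(p.real < 0 for p in closed_loop_poles(gbw, p1, p2, beta))

# Hypothetical values in rad/s (assumptions, not taken from the design)
GBW = 2 * math.pi * 10e6   # gain bandwidth product
P1 = 2 * math.pi * 10.0    # magnitude of the RHP dominant pole
P2 = 2 * math.pi * 30e6    # non-dominant pole

stable_unity = is_stable(GBW, P1, P2, beta=1.0)               # beta > P1/GBW
stable_tiny = is_stable(GBW, P1, P2, beta=0.5 * P1 / GBW)     # beta < P1/GBW
print(stable_unity, stable_tiny)
```

With these assumed values the unity feedback case is stable, while a feedback factor below P<sub>1</sub>/GBW leaves one pole in the right half plane.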





Figure 2.7: A two-stage op amp with a RHP dominant pole in a negative feedback loop

#### **2.8.4 Frequency Analysis**

In order to understand the effects of the proposed conductance cancellation circuit on the entire op amp's frequency response, the small signal block diagram of the proposed op amp, shown in Figure 2.8, is used for frequency analysis. Nodes  $(1)\sim(5)$  and (9) in Figure 2.6 are of very low impedance. The poles associated with these nodes are close to or fractions of the transistors' unity current gain frequencies (f<sub>T</sub>) and are significantly larger than the unity gain frequency (UGF) of the designed op amp. Since the frequency of interest (possible stability and slow time constant concerns) is below the UGF, the poles associated with nodes  $(1)\sim(5)$  and (9) will be neglected in the following calculations for simplicity. Similarly, the poles and zeros close to transistors' f<sub>T</sub> in u1(s) and u2(s) will also be neglected. The full expressions of u1(s) and u2(s) are derived as (2-43) and (2-44), in which Zx, Px, Zy, and Py are shown in (2-45).

$$u_{1}(s) \approx \frac{\left(1 + s\frac{C_{gs22}}{g_{m22}}\right)}{1 + s\frac{C_{gs22} + C_{gs14}}{g_{m22}}} \frac{g_{ds16}\left(1 + \frac{sC_{gd14}}{g_{ds16}}\right)\left(1 - \frac{sC_{gs16}}{g_{m16}}\right)}{\left(1 + \frac{sC_{gs16}}{g_{m16}}\right)} \approx \frac{g_{ds16}\left(1 + \frac{s}{Z_{x}}\right)}{1 + \frac{s}{P_{x}}}$$
(2-43)

$$u_{2}(s) \approx \frac{\left(1 + s\frac{C_{gs7}}{g_{m7}}\right)}{1 + s\frac{C_{gs7} + C_{gs3}}{g_{m7}}} \frac{g_{ds5}(1 + \frac{sC_{gd3}}{g_{ds5}})(1 - \frac{sC_{gs5}}{g_{m5}})}{(1 + \frac{sC_{gs5}}{g_{m5}})} \approx \frac{g_{ds5}\left(1 + \frac{s}{Z_{y}}\right)}{1 + \frac{s}{P_{y}}}$$
(2-44)






Figure 2.8: Small signal block diagram of the proposed op amp

In order to find the poles and zeros created by the gain enhancement method, KCL equations at nodes  $(2)\sim (6)$  and (10) are written as  $(2-46) \sim (2-51)$ , where gi and Ci,  $i \in [1, 2, ..., 6]$ , are shown in Table 2.1. To obtain design insights from the transfer function from the op amp input to output (V<sub>out</sub>/V<sub>id</sub>), three assumptions are made to simplify the transfer function without losing accuracy during the derivation. The three assumptions are:

- 1) The transconductances of transistors M1~M6 and M10~M21 are much larger than their conductances.
- 2)  $C_L >> C_C >> C_3, C_5$ .
- 3) The current mirror ratio of M6 to M10 is 1.

$$-0.5V_{id}g_{m1}g_{m10}/g_{m6} + V_2(g_2 + sC_2) + V_3u_1(s) = 0$$
(2-46)

$$g_{m17}V_2 + V_3(g_3 + sC_3) + g_{ds15}(V_3 - V_{o1}) + g_{m15}V_3 = 0$$
(2-47)

$$-0.5g_{m1}V_{id} + V_4 (g_4 + sC_4) + V_5 u_2(s) = 0$$
(2-48)

$$g_{m10}V_4 + V_5(g_5 + sC_5 + g_{ds13} + g_{m13}) - g_{ds13}V_{o1} = 0$$
(2-49)



$$g_{ds13}(V_5 - V_{o1}) + g_{m13}V_5 + g_{ds15}(V_3 - V_{o1}) + g_{m15}V_3 = V_{o1}sC_6 + \frac{(V_{o1} - V_{out})sC_c}{1 + sR_cC_c}$$
(2-50)

$$(V_{o1} - V_{out})sC_c/(1 + sR_cC_c) + V_2g_{m18}g_{m20}/g_{m19} = V_{o1}g_{m21} + V_{out}(g_L + sC_L)$$
 (2-51)

$$\frac{V_{out}}{V_{id}} \approx \frac{\frac{g_{m1}}{P_1 C_c} \left(1 + \frac{sC_c g_{meff}}{2g_{m16} g_{m21}}\right) \left(1 + \frac{sg_{m21} tR_c C_c}{g_{meff}}\right) \left(1 + s\frac{P_x + P_y}{P_x P_y}\right) \left(1 + \frac{s}{P_x + P_y}\right)}{\left(1 + \frac{s}{P_1}\right) \left(1 + \frac{sC_L}{g_{m21}}\right) (1 + sR_c C_c) \left(1 + s\frac{P_x + P_y}{P_x P_y}\right) \left(1 + \frac{s}{P_x + P_y}\right)}$$

$$t = \frac{g_{m18} g_{m20}}{g_{m19} g_{m21}}; \quad g_{meff} = g_{m21} t + 2g_{m16} (g_{m21} R_c - 1)$$
(2-52)

$$P_{1} = g_{L} \left(\frac{g_{A}g_{ds13}}{g_{m13}} + \frac{g_{B}g_{ds15}}{g_{m15}}\right) / (g_{m21}Cc)$$
(2-53)

With the three assumptions above, the transfer function $V_{out}/V_{id}$ is derived as (2-52), where P<sub>1</sub>, t, and g<sub>meff</sub> are given in (2-53). The expressions of g<sub>A</sub> and g<sub>B</sub> are the same as those shown in (2-36). Expression (2-52) shows that the high frequency poles (P<sub>x</sub> and P<sub>y</sub>) associated with $u_1(s)$ and $u_2(s)$ form two compressed pole-zero pairs at frequencies around $(P_xP_y)/(P_x+P_y)$ and $P_x+P_y$. Fortunately, P<sub>x</sub> and P<sub>y</sub> are fractions of the transistors' f<sub>T</sub>, so they are inherently high frequency poles. As long as both P<sub>x</sub> and P<sub>y</sub> are at frequencies several times higher than the UGF of the op amp shown in Figure 2.6, the FVA-based gain enhancement technique should not change an op amp's high frequency response.

| Conductance                       | Capacitance                                                        |
|-----------------------------------|--------------------------------------------------------------------|
| $g_1 \approx g_{m6} + g_{ds2}$    | $C_1 \approx C_{gs6} + C_{gs10} + C_{gd2} + C_{gd4}$               |
| $g_2 \approx g_{m16}$             | $C_2 \approx C_{gs16} + C_{gs17} + C_{gs18} + C_{gd14} + C_{gd12}$ |
| $g_3 \approx g_{ds17}$            | $C_3 \approx C_{gs15} + C_{gs22} + C_{gd17}$                       |
| $g_4 \approx g_{m5} + g_{ds1}$    | $C_4 \approx C_{gs5} + C_{gs11} + C_{gd1} + C_{gd3}$               |
| $g_5 \approx g_{ds11}$            | $C_5 \approx C_{gs13} + C_{gs7} + C_{gd11}$                        |
| $g_L \approx g_{ds21} + g_{ds20}$ | $C_6 \approx C_{gs21} + C_{gd15} + C_{gd13}$                       |
| $g_9 \approx g_{m9}$              | $C_9 \approx C_{gs20} + C_{gs19} + C_{gd18}$                       |

Table 2.1: Expression of the conductance and capacitance in the proposed op amp


#### 2.8.5 Noise Analysis

The noise model of the proposed op amp is shown in Figure 2.9. The modeled voltage noise includes flicker and thermal noise of the transistors. As the first stage of the proposed op amp has a high gain, the input referred noise contributed from the output stage is negligible. Noise contribution from cascode transistors will also be neglected because it is much smaller than the noise contribution from the input pair and current sources.



Figure 2.9: Noise model of the proposed op amp

Therefore, the input referred noise power of the proposed op amp is approximately calculated as (2-54). In (2-54), the noise terms of  $(4g_{m10}^2e_{n10}^2 + 2g_{m17}^2e_{n17}^2 + 2g_{m1}^2e_{n1}^2)/g_{m1}^2$  and  $(g_{m24}^2e_{n24}^2 + g_{m9}^2e_{n9}^2)/g_{m1}^2$  are respectively contributed from the op amp core and the gds cancellation circuits. The noise contribution ratio of M24 to M10 can be found as (2-55), where  $\alpha$  is the size ratio of M24 to M10. As the size ratio of M9 to M16 is the same as that of M24 to M10, it can be found that the noise contribution ratio of M9 to M16 is also  $\alpha$ . Therefore, the input referred noise power of the proposed op amp can be simplified as (2-56). As  $\alpha$  is set as 1/6 in this design, the extra noise contribution from the conductance cancellation circuit is a very small portion of the total noise.



$$e_{eq,prop}^{2} \approx \frac{4g_{m10}^{2}e_{n10}^{2} + g_{m24}^{2}e_{n24}^{2} + 2g_{m17}^{2}e_{n17}^{2} + g_{m9}^{2}e_{n9}^{2} + 2g_{m1}^{2}e_{n1}^{2}}{g_{m1}^{2}}$$
(2-54)

$$\frac{g_{m24}^2 e_{n24}^2}{g_{m10}^2 e_{n10}^2} = \frac{g_{m24}^2 \left(\frac{\delta kT}{3g_{m24}} + \frac{KF_{flicker}}{W_{24}L_{24}C_{ox}f}\right)\Delta f}{g_{m10}^2 \left(\frac{\delta kT}{3g_{m10}} + \frac{KF_{flicker}}{W_{10}L_{10}C_{ox}f}\right)\Delta f} = \frac{g_{m24}}{g_{m10}} = \alpha$$
(2-55)

$$e_{eq,prop}^{2} \approx \frac{(4+\alpha)g_{m10}^{2}e_{n10}^{2} + (2+\alpha)g_{m17}^{2}e_{n17}^{2} + 2g_{m1}^{2}e_{n1}^{2}}{g_{m1}^{2}}$$
(2-56)
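The size of the extra noise term can be bounded directly from (2-54) and (2-56). The sketch below is illustrative only: the arguments `a10`, `a17`, and `a1` stand for the products $g_{m10}^2e_{n10}^2$, $g_{m17}^2e_{n17}^2$, and $g_{m1}^2e_{n1}^2$, and the numerical values passed to the function are hypothetical. Whatever those values are, the share of noise power added by the cancellation circuits cannot exceed $\alpha/(2+\alpha)$, which is 1/13 for $\alpha = 1/6$.

```python
def extra_noise_fraction(alpha, a10, a17, a1):
    """Share of the input referred noise power contributed by M24 and M9.

    a10 = gm10^2 * en10^2, a17 = gm17^2 * en17^2, a1 = gm1^2 * en1^2.
    The extra terms are alpha*a10 and alpha*a17, following (2-54) to (2-56).
    """
    extra = alpha * (a10 + a17)
    total = (4 + alpha) * a10 + (2 + alpha) * a17 + 2 * a1
    return extra / total

alpha = 1 / 6                     # size ratio used in this design
bound = alpha / (2 + alpha)       # upper bound on the extra share, equal to 1/13

# Hypothetical equal noise terms, for illustration only
example = extra_noise_fraction(alpha, a10=1.0, a17=1.0, a1=1.0)
print(f"bound = {bound:.4f}, example = {example:.4f}")
```

The bound is reached only in the limiting case where the current source term $g_{m17}^2e_{n17}^2$ dominates everything else; in any other case the overhead is smaller.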

## 2.8.6 Simulation and Measurement Results



Figure 2.10: Layout and microphotograph of the fabricated prototype op amp

In the designed prototype op amp shown in Figure 2.6, the feedback paths from the gate of M7 to the gate of M3 and from the gate of M22 to the gate of M14 are made controllable by a switch so that the DC gain of the op amp can be compared in two conditions: one with the gds cancellation circuit enabled (proposed) and one with the circuit disabled (conventional). The microphotographs and layouts of the two op amps are combined and shown in Figure 2.10. Post-layout simulation and measurement results of the two op amps are compared under various process corners, temperatures, supply voltages, and OSW. The comparison shows the effectiveness of the proposed method under process, supply voltage, temperature, and OSW variations. The Monte Carlo simulation with 200 runs confirms the proposed op amp's ability to provide large DC gain enhancement under random mismatch.





#### 2.8.6.1 Simulated Open Loop DC Gain vs. PVT Variation

Figure 2.11: Simulated DC gain vs. (a) P.T. variation (b) supply voltage (c) OSW

The two op amps are placed in a unity gain buffer structure without any resistive load in the following post-layout DC gain simulation to show the open loop DC gain of the op amps. Figure 2.11(a), (b) and (c) respectively show the dependency of the proposed and conventional op amps' open loop DC gain on PVT variations and OSW. The dashed and solid lines respectively correspond to the proposed and conventional op amps' performance. Figure 2.11(a) shows that the DC gain of the proposed and conventional op amps respectively ranges from 110.4dB to 139.8dB and from 87.2dB to 90.1dB. The gap between the solid and dashed lines represents the amount of DC gain boost brought solely by the proposed technique. Across all process corners and temperatures ranging from -40°C to 80°C, the minimum DC gain boost yielded by the proposed technique is around 23.2dB, which is comparable to the DC gain of a transistor in this process. This amount of DC gain boost is also consistent with the calculation in (2-34). Figure 2.11(b) shows that the minimum amount of DC gain enhancement brought by the proposed technique is about 21.1dB when the supply voltage varies from 1.3V to 1.8V across all process corners. Figure 2.11(c) demonstrates that under the nominal supply voltage of 1.5V and room temperature, the proposed technique provides at least 22.5dB of DC gain boost across all process corners when the output voltage varies from 0.1V to 1.3V.

Another noteworthy finding is that the DC gain enhancement brought by the proposed technique still has potential for further improvement and such potential can be easily realized. This is because the constraints on the current DC gain enhancement mostly originate from variations in process corner instead of temperature, OSW or supply voltage, as can be observed from Figure 2.11. These constraints can be reduced through a one-time trimming, which can be implemented using either registers or one-time programmable elements (OTP).

#### 2.8.6.2 Measured DC Gain vs. OSW

The measurement setup shown in Figure 2.12(a) is used to measure the DC gain of the two op amps, in which $V_{cm1}$ and $V_{cm2}$ are set to half of the supply voltage using two DVC 8500 voltage calibrators. The servo loop in Figure 2.12(a) keeps the DUT's output equal to $V_{force}$ by adjusting the DUT's inverting input accordingly. The voltage change at the DUT's inverting input is amplified by a resistor ratio of 1000 and then low-pass filtered before being measured by an Agilent 34401A multimeter. The DC gain of the two op amps (DUT) is calculated by $A_{OL} = |\Delta V_{force}/\Delta V_{out}| * 1000$. The measured DC gain of the proposed and conventional op amps is shown in Figure 2.12(c). It can be seen that the proposed method provides more than 26.4dB of DC gain enhancement over the conventional op amp. In addition, this DC gain boost drops by only 1dB over an OSW of 0.1V~1.4V under a supply voltage of 1.5V.
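The gain calculation used in this measurement can be written as a short helper. This is a sketch under stated assumptions: `delta_v_meas` is taken to be the change read at the multimeter (the $\Delta V_{out}$ of the formula above), and the 1 V and 4 mV readings are hypothetical values chosen for illustration, not measured data.

```python
import math

def open_loop_gain_db(delta_v_force, delta_v_meas, resistor_ratio=1000.0):
    """DC gain in dB from A_OL = |dV_force / dV_meas| * resistor_ratio."""
    a_ol = abs(delta_v_force / delta_v_meas) * resistor_ratio
    return 20.0 * math.log10(a_ol)

# Hypothetical readings: V_force stepped by 1 V, amplified reading moves by 4 mV
gain_db = open_loop_gain_db(delta_v_force=1.0, delta_v_meas=4e-3)
print(f"A_OL = {gain_db:.2f} dB")
```

The resistor ratio of 1000 scales the small change at the inverting input up to a level the multimeter can resolve, which is why it reappears as a multiplier in the gain formula.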



Figure 2.12: Op amps DC gain measurement (a) schematic (b) lab setup (c) measured DC gain vs. OSW





Figure 2.13: Op amps' DC gain under P.Mis variation (a) proposed op amp (b) conventional op amp (c) gain enhancement

The dependency of the DC gain enhancement on critical transistors' mismatch has been analyzed in Section 2.6. A Monte Carlo simulation of 200 runs is used to check the effectiveness of the method under process corner and random mismatch (P.Mis) variations. The Monte Carlo simulated DC gain of the proposed and conventional op amps is shown in Figure 2.13(a) and (b). The DC gain of the conventional op amp ranges from 83.49dB to 92.77dB with a mean of 89.77dB and a sigma of 1.8dB, whereas the proposed op amp produces a DC gain ranging from 104.2dB to 150.2dB with a mean of 123.9dB and a sigma of 9.19dB. The Monte Carlo simulated DC gain boost brought by the proposed conductance cancellation method is shown in Figure 2.13(c). The mean, sigma, maximum, and minimum of the DC gain enhancement are respectively 34.13dB, 8.3dB, 59.12dB, and 18.8dB.

#### 2.8.6.4 Simulated AC Frequency Response

Figure 2.14 shows the AC responses of the proposed and conventional op amps when they are placed in a unity gain buffer structure with a resistive and capacitive load of $20k\Omega$ and 40pF. It can be seen that the proposed DC gain enhancement method increases the op amp's low frequency gain while preserving the conventional op amp's high frequency response. This is consistent with the frequency analysis. The simulation results show that the two op amps have the same GBW and PM of 13.6MHz and 55.7°.



Figure 2.14: Post-layout simulated AC responses of the prop. and conv. op amps



#### 2.8.6.5 Measured Transient Response

The measured transient responses of the proposed and conventional op amps in a unity gain buffer configuration are almost the same; the response of the proposed op amp is shown in Figure 2.15. The blue curve is the 0.5V input step voltage and the red curve is the output of the proposed op amp. The rising and falling slew rates of the two op amps are about 19.4V/$\mu$s and 14.38V/$\mu$s.



Figure 2.15: Measured transient performance of the proposed op amp

#### 2.8.6.6 Simulated Noise Performance

The input referred voltage noise densities of the proposed and conventional op amps are shown as red and blue curves in Figure 2.16. As expected, the two op amps have almost the same noise performance. Specifically, the integrated voltage noise of the proposed and conventional op amps from 0.1Hz to 10Hz is respectively 7.847µV and 7.815µV. Of the integrated noise of the proposed op amp, about 95.6%, 3.1% and 1.3% come respectively from the op amp core, M9, and M24 in Figure 2.6.





Figure 2.16: Post-layout simulated noise performance of the two op amps

#### 2.8.6.7 Performance Summary and Comparison

Table 2.2 shows a performance comparison of the proposed and conventional op amps. Both simulated and measured results of the two op amps demonstrate the effectiveness and robustness of the proposed conductance cancellation method for DC gain enhancement.

Compared with previous work [3][5], this work provides a simpler, more robust, and cost-efficient solution to enhance the DC gain of an op amp. For example, the DC gain boost of the fully differential op amp in [3] drops by 33dB with OSW between -0.24V and 0.24V under a supply voltage of 3V, while the amount of boost in this work only drops by 1dB with OSR (output swing) between 0.1V and 1.4V under a 1.5V supply. The normalized sensitivity of the DC gain boost with respect to OSW, S<sub>OSR</sub>, can be calculated as $\Delta A_{en}/OSR * V_{supply}$. The S<sub>OSR</sub> of this work and [3] are respectively 0.5dB and 412.5dB. A detailed comparison between this work and [3][5] is shown in Table 2.3. With all process corner variations considered, A<sub>PT\_min</sub>, A<sub>PS\_min</sub>, and A<sub>POSW\_min</sub> separately represent the minimum DC gain enhancement under temperatures between -40°C and 80°C, supply voltage between 1.3V and 1.8V, and OSR between 0.1V and 1.3V. A<sub>PMS\_min</sub>, A<sub>PMS\_mean</sub>, and A<sub>PMS\_max</sub> represent the minimum, mean, and maximum DC gain enhancement of the proposed method based on Monte Carlo simulation results.

| Op Amps                                                 | Proposed                             | Conv.                                |
|---------------------------------------------------------|--------------------------------------|--------------------------------------|
| <sup>+</sup> DC gain (dB)                               | 108                                  | 80                                   |
| Gain bandwidth product (MHz)                            | 8.0                                  | 8.0                                  |
| Phase margin (°)                                        | 50                                   | 50.3                                 |
| *Input referred noise (µV <sub>rms</sub> ) (0.1Hz-1Hz)  | 5.719                                | 5.694                                |
| *Input referred noise (µV <sub>rms</sub> ) (0.1Hz-1MHz) | 16.467                               | 16.409                               |
| Capacitive and resistive load                           | $40 \mathrm{pF}/20 \mathrm{k}\Omega$ | $40 \mathrm{pF}/20 \mathrm{k}\Omega$ |
| $SR+/SR-(V/\mu s)$                                      | 19.38/14.38                          | 19.30/14.30                          |
| Supply voltage (V)                                      | 1.5                                  | 1.5                                  |
| Current consumption (µA)                                | 1128                                 | 1108                                 |
| Area $(\mu m^2)$                                        | 14836                                | 14432                                |
| Process technology                                      | IBM 130nm CMOS                       | IBM 130nm CMOS                       |

Table 2.2: Summary of measured performance of the two op amps

<sup>+</sup> DC gain measurement setup in Figure 2.12(a), \* post-layout simulation result

|                                                                              | [3] Yan                         | [5] He                   | This work        |
|------------------------------------------------------------------------------|---------------------------------|--------------------------|------------------|
| CMOS Process                                                                 | 0.5µm                           | 0.5µm                    | 0.13µm           |
| Supply voltage (V)                                                           | 3                               | -                        | 1.5              |
| Current consumption (mA) (excluding tuning circuits)                         | 15                              | -                        | 1.128            |
| DC gain (dB)                                                                 | >83                             | >60                      | 108              |
| DC gain boost (dB)                                                           | -                               | -                        | 26.4             |
| OSW (V)                                                                      | -0.24~0.24                      | -                        | 0.1~1.3          |
| Drop in DC gain boost over OSW (dB)                                          | 33                              | -                        | 1                |
| S<sub>OSW</sub> (dB/V)                                                       | 68.75                           | -                        | 0.77             |
| A<sub>PT\_min</sub> (dB)                                                     | -                               | -                        | 23.2             |
| A<sub>PS\_min</sub> (dB)                                                     | -                               | -                        | 21.1             |
| A<sub>POSW\_min</sub> (dB)                                                   | -                               | -                        | 22.5             |
| A<sub>PMS\_min</sub>, A<sub>PMS\_mean</sub> and A<sub>PMS\_max</sub> (dB)    |                                 |                          | 18.8, 34.1, 59.1 |
| Tuning circuits                                                              | High-gain low-offset comparator | 16b DAC, comparator, MCU | NA               |
| Power/Area overhead (%)                                                      |                                 |                          | 1.8/2.8          |

Table 2.3: Performance comparison to the literature
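The S<sub>OSW</sub> entries of Table 2.3 appear to follow from dividing the drop in DC gain boost by the width of the output swing range. The sketch below reproduces the two tabulated values under that reading; using the measured 0.1V to 1.4V range for this work is an assumption made here, since it is the range over which the 1dB drop is reported.

```python
def boost_sensitivity_db_per_v(boost_drop_db, swing_low_v, swing_high_v):
    """Drop in DC gain boost per volt of output swing range."""
    return boost_drop_db / (swing_high_v - swing_low_v)

# [3]: 33 dB drop over -0.24 V to 0.24 V
s_ref3 = boost_sensitivity_db_per_v(33.0, -0.24, 0.24)
# This work: 1 dB drop over 0.1 V to 1.4 V (assumed range, see lead-in)
s_this = boost_sensitivity_db_per_v(1.0, 0.1, 1.4)

print(f"[3]: {s_ref3:.2f} dB/V, this work: {s_this:.2f} dB/V")
```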



## 2.9. A Folded Cascode Amplifier with the FVA-based GE Technique [6]

The FVA-based gain enhancement technique is also suitable for folded cascode amplifiers (FCAs). Figure 2.17 shows an FCA design with the FVA-based gain enhancement technique. In this work, two fully differential two-stage op amps are designed in the IBM 130nm CMOS process. The two op amps share the same core amplifier as shown in Figure 2.17(c), except that the first one (conventional) does not have any gain enhancement technique, whereas the second op amp has the aforementioned FVA-based gain enhancement technique. In the core amplifier, transistors M3 and M4 are cascoded to the op amp's input pair, M1 and M2, so that the conductance looking up from the drain of M4 is much smaller than g<sub>ds25</sub>. Thus, g<sub>ds25</sub> is the main positive conductance of the NMOS side to be cancelled in order to achieve DC gain enhancement. The second stage of the op amp is a folded mesh class-AB output stage.



Figure 2.17: A fully differential FCA with the FVA-based technique

$$g_{\rm D} \approx g_{\rm ds25} \left( 1 - \frac{g_{\rm ds24} g_{\rm m25}}{g_{\rm ds25} g_{\rm m24}} \frac{1 + \eta_{31}}{1 + \eta_{23}} \right) \approx -g_{\rm ds25}(\varepsilon_1 + \varepsilon_2)$$
(2-57)

Similar to the conductance cancellation analysis in Section 2.6, $g_D$ is the net conductance looking down from the drain of transistor M25 and can be found as (2-57), in which $\eta_{31}$ and $\eta_{23}$ are the body effect coefficients of transistors M31 and M23 respectively. The values of $\eta_{31}$ and $\eta_{23}$ are both close to 0.15 in this process. Also, $\varepsilon_1$ is the intrinsic gain mismatch between transistors M24 and M25 and $\varepsilon_2$ is the body effect mismatch between transistors M23 and M31. $\varepsilon_1$ is about 5% in this design but $\varepsilon_2$ is negligibly small. Therefore, the magnitude of the expected $g_D$ is about 20 times less than $g_{ds25}$. A similar amount of conductance reduction is expected from the PMOS side conductance cancellation circuit. Therefore, the expected $g_B$, the net conductance looking up from the drain of M6, is about 20 times less than $g_{ds6}$.
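The link between residual mismatch and achievable gain boost can be estimated with a one-line calculation. The sketch below assumes that the first stage DC gain is inversely proportional to the net conductance and that the NMOS and PMOS sides are reduced by the same factor; both are simplifications introduced here for illustration. Under these assumptions a residual of $|\varepsilon_1 + \varepsilon_2|$ corresponds to a boost of $20\log_{10}(1/|\varepsilon_1 + \varepsilon_2|)$.

```python
import math

def expected_boost_db(eps1, eps2=0.0):
    """Estimated DC gain boost when the net conductance shrinks to |eps1 + eps2|
    of the original conductance (same factor assumed on both sides)."""
    residual = abs(eps1 + eps2)
    return 20.0 * math.log10(1.0 / residual)

boost_5pct = expected_boost_db(0.05)    # 20x conductance reduction
boost_10pct = expected_boost_db(0.10)   # 10x conductance reduction
print(f"5% mismatch: {boost_5pct:.1f} dB, 10% mismatch: {boost_10pct:.1f} dB")
```

A 5% residual gives roughly 26dB and a 10% residual gives 20dB, which matches the earlier sizing guideline that mismatch within 10% supports a 20dB enhancement.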





Figure 2.19:  $g_D$  and  $g_B$  under P.T variation a)  $g_D$  b)  $g_B$ 





Figure 2.20: DC gain of the proposed and conventional op amp

As discussed in Section 2.8.6, the FVA-based gain enhancement technique is mainly affected by process variation. The original positive conductance is $g_{D1}$ looking down from M25's drain node and $g_{B1}$ looking up from M6's drain node. Figure 2.18 shows the variation of the original positive conductance under process corner and temperature (P.T) variation. As can be seen, $g_{D1}$ varies from $35\mu$S to $110\mu$S, whereas $g_{B1}$ varies from $22\mu$S to $40\mu$S. Compared to $g_{B1}$, the larger variation range of $g_{D1}$ is caused by an inherently wider spread of NMOS transistors' transconductance and conductance over process and temperature variations. The net conductances looking down from M25's drain node and looking up from M6's drain node are denoted as $g_D$ and $g_B$ respectively. The simulated $g_D$ and $g_B$ are shown in Figure 2.19. As can be seen, $g_D$ only varies from -9.62$\mu$S to 2.93$\mu$S and $g_B$ only changes from -1.09$\mu$S to 2.58$\mu$S over process and temperature variations. The much smaller absolute values of $g_D$ and $g_B$ confirm the conductance cancellation effect of the FVA-based gain enhancement technique. The amount of DC gain enhancement arising from the technique is shown in Figure 2.20. It shows that the minimum DC gain improvement is 28.9dB over process and temperature variations. This verifies the effectiveness and robustness of the proposed FVA-based gain enhancement for folded cascode op amps. The performance summary of the two designed op amps is shown in Table 2.4.

Without the aid of any tuning circuit, the proposed FVA-based gain enhancement technique maintains a DC gain enhancement of over 28.9dB for temperatures between -40°C and 80°C, over 27.6dB for supply voltages between 1.4V and 2V, and over 29dB for differential output swing between -1.1V and 1.1V. The power and area overhead of the gain enhancement circuit are respectively only 7% and 3% of those of the conventional op amp.

| Op Amps                           | Conventional   | Proposed       |
|-----------------------------------|----------------|----------------|
| DC gain (dB)                      | 87.4           | 131.4          |
| Load capacitor (pF)               | 20             | 20             |
| GBW/UGF (MHz)                     | 73.1/66.51     | 75.5/67.53     |
| PM (°)                            | 53.9           | 53.6           |
| $SR+/SR-(V/\mu s)$                | 51.2/51.3      | 51.3/51.9      |
| 1% settling time (ns)             | 40.8           | 40.0           |
| 0.01% settling time (ns)          | 66             | 66             |
| 0.0001% settling time (ns)        | NA             | 90             |
| Supply voltage (V)                | 1.5            | 1.5            |
| Current (µA)                      | 1141           | 1221           |
| Estimated area (mm<sup>2</sup>)   | 0.0532         | 0.0548         |
| Process technology                | IBM 130nm CMOS | IBM 130nm CMOS |

Table 2.4: Performance summary of the designed op amps
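The quoted power and area overheads can be recomputed from the current and area entries of Tables 2.2 and 2.4. The sketch below does only that arithmetic; since both op amps in each pair run from the same supply voltage, the current ratio is used here as the power ratio.

```python
def overhead_percent(proposed, conventional):
    """Relative increase of the proposed design over the conventional one, in percent."""
    return 100.0 * (proposed - conventional) / conventional

# Folded cascode amplifier, Table 2.4 (current in uA, area in mm^2)
fca_power = overhead_percent(1221, 1141)
fca_area = overhead_percent(0.0548, 0.0532)

# Current mirror input op amp, Table 2.2 (current in uA, area in um^2)
cm_power = overhead_percent(1128, 1108)
cm_area = overhead_percent(14836, 14432)

print(f"FCA: {fca_power:.1f}% power, {fca_area:.1f}% area")
print(f"Current mirror op amp: {cm_power:.1f}% power, {cm_area:.1f}% area")
```

The results round to 7% and 3% for the folded cascode design and to 1.8% and 2.8% for the current mirror input design, in line with the values stated in the text and in Table 2.3.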

## 2.10. Discussion

The proposed SDC-based and FVA-based gain enhancement techniques are ultimately limited by the intrinsic gains of the critical transistors, such as transistors M5, M11, M16 and M17 in Figure 2.6. This limitation can be mitigated by replacing the critical transistors with compound transistors or gain blocks which have much higher DC gain than a single transistor.

As for the design of the low gain amplifier in both the SDC-based and FVA-based gain enhancement techniques, the constancy of the amplifier's DC gain is critical. If the amplifier's DC gain is not constant under PVT variations, significant design and simulation effort is needed to achieve large DC gain enhancement. This has been analyzed and discussed in detail in [5]. Fortunately, the low gain amplifier or the level shifter in the SDC-based and FVA-based gain enhancement techniques in Sections 2.5 and 2.6 has very good gain constancy.

## 2.11. Summary

A new gds cancellation method to robustly improve op amps' DC gain with negligible power and area overhead has been introduced. The method can be implemented based on the source degeneration circuit (SDC) or the flipped voltage attenuator (FVA). Compared to the FVA-based method, the SDC-based technique is more suitable for CMOS processes in which transistors' threshold voltages are too low for the transistors to work in weak or strong inversion regions in the FVA configuration. Otherwise, the FVA-based technique is recommended as it is more robust to devices' random mismatch. A prototype current mirror input op amp with the FVA-based technique is designed and fabricated in the IBM 130nm process. The measurement and simulation results of the prototype verify that the technique effectively enhances an op amp's DC gain (>20dB) and is very robust over process, voltage and temperature variations. Another prototype folded cascode amplifier design with the FVA-based technique also shows large DC gain enhancement.

The simulation and measurement results agree well with the theoretical analysis. The effectiveness of the proposed gain enhancement method is supported by the measurement and post-layout simulation results of two prototype op amps in the presence of variations in process, temperature, supply voltage, output voltage swing, and random mismatch. The design simplicity, gain enhancement effectiveness, low power and area overhead, and zero degradation of settling time performance make the proposed gain enhancement method suitable for many high precision applications such as switched-capacitor circuits and sigma-delta converters.

## **2.12. References**

- [6]. K. Bult and G. J. G. M. Geelen, "A fast-settling CMOS op amp for SC circuits with 90-dB DC gain," *IEEE Journal of Solid-State Circuits*, vol. 25, no. 6, pp. 1379-1384, Dec. 1990.
- [7]. J. Yan and R. L. Geiger, "Fast-settling CMOS operational amplifiers with negative conductance voltage gain enhancement," *ISCAS*, Sydney, Australia, pp. 228-231, 2001.
- [8]. J. Yan and R. L. Geiger, "A high gain CMOS operational amplifier with negative conductance gain enhancement," *CICC 2002*, Orlando, FL, USA, pp. 337-340, 2002.
- [9]. C. He, L. Jin, D. Chen and R. L. Geiger, "Robust High-Gain Amplifier Design Using Dynamical Systems and Bifurcation Theory With Digital Postprocessing Techniques," *IEEE TCAS I*, vol. 54, no. 5, pp. 964-973, May 2007.
- [10]. B. Huang and D. Chen, "Power-efficient, PVT robust conductance cancellation method for gain enhancement," *Electronics Letters*, vol. 49, no. 16, Aug. 1, 2013.
- [11]. B. Huang and D. Chen, "An Effective Conductance Cancellation Method with Minimal Design Effort," *IEEE Midwest Symposium on Circuits and Systems (MWSCAS)*, College Station, TX, 2014.
- [12]. B. Huang and D. Chen, "A High Gain Operational Amplifier via an Efficient Conductance Cancellation Technique," *IEEE Custom Integrated Circuits Conference (CICC)*, San Jose, CA, USA, 2014.



# CHAPTER 3. SLEW RATE ENHANCEMENT FOR OPERATIONAL TRANSCONDUCTANCE AMPLIFIERS

## **3.1. Introduction**

In switched-capacitor circuits and other applications with large capacitive loads, such as liquid crystal display drivers, OTAs must provide sufficient slew rate (SR) to achieve fast settling performance. In a conventional Class-A OTA, as shown in Figure 3.1, the SR and gain-bandwidth product (GBW) are given by (3-1). To maximize the g<sub>m</sub>/I<sub>tail</sub> efficiency and optimize the OTA's noise and GBW, the input pair (M1 and M2) usually works in the weak inversion region with an overdrive voltage typically around 70~80mV. Therefore, the ratio of SR to GBW is derived as (3-2) and its value is about 0.1V.

When a sine wave with a frequency of GBW/$2\pi$ is applied at the input of the OTA in the configuration of a noninverting unity gain buffer, the ideal output voltage of the OTA, V<sub>out</sub>, is given by (3-3). In order to avoid slew rate induced distortion at V<sub>out</sub>, the OTA's SR (~0.1\*GBW) needs to be larger than the fastest voltage change rate of V<sub>out</sub>. The fastest change rate happens at the zero-crossing point and is equal to GBW\*A. Therefore, if the peak-to-peak V<sub>out</sub> voltage is more than 0.2V at a frequency of GBW/$2\pi$, the OTA's limited slew rate starts to cause distortion. In order to improve the linearity of low gain OTAs, it is very important to decouple their gain bandwidth product (GBW) and slew rate (SR), and to preserve the OTAs' DC and small signal performance. In an effort to improve the slew rate of OTAs with small static power consumption, several different methods have been reported in the literature and will be reviewed in Section 3.2.

$$GBW = \frac{g_{m1}}{C_L}; \quad SR = \frac{I_{tail}}{C_L}$$
(3-1)



$$\frac{SR}{GBW} = \frac{\frac{I_{tail}}{C_L}}{\frac{g_{m1}}{C_L}} = \frac{I_{tail}}{g_{m1}} = \frac{2I_1}{g_{m1}} = 2nV_T \approx 0.1V$$
(3-2)

$$V_{out}(t) = A\sin(GBW * t)$$
(3-3)
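The numbers behind (3-2) and the resulting amplitude limit can be reproduced with a few lines. This is a rough sketch: the subthreshold slope factor `n = 1.5` and the temperature of 300K are assumptions chosen for illustration, and the result is of the same order as the 0.1V approximation used in the text.

```python
K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C

def thermal_voltage(temp_k=300.0):
    """Thermal voltage V_T = kT/q in volts."""
    return K_B * temp_k / Q_E

def sr_over_gbw(n=1.5, temp_k=300.0):
    """SR/GBW = 2*n*V_T for a weak inversion input pair, as in (3-2)."""
    return 2.0 * n * thermal_voltage(temp_k)

def max_peak_to_peak(n=1.5, temp_k=300.0):
    """Largest peak-to-peak swing at GBW/(2*pi) without slewing.

    The maximum slope of A*sin(GBW*t) is GBW*A, so A must not exceed SR/GBW.
    """
    return 2.0 * sr_over_gbw(n, temp_k)

ratio_v = sr_over_gbw()
vpp_v = max_peak_to_peak()
print(f"SR/GBW = {ratio_v * 1e3:.1f} mV, max peak-to-peak = {vpp_v * 1e3:.1f} mV")
```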



Figure 3.1: Conventional Class-A operational transconductance amplifier

## **3.2. Literature Review**

In the literature, many different slew rate enhancement (SRE) methods [1-6] have been proposed but they all suffer from various drawbacks. For example, some SRE methods [1] [2] are incompatible with low supply voltage, some [3-5] degrade amplifier linearity, some [6] are sensitive to input common mode range (ICMR), yet others [4] require complex circuits producing large power and area overhead.

One of the widely used SRE methods for OTAs is the adaptive biasing scheme [3], as shown in Figure 3.2. The current mirror ratios of all current mirrors in Figure 3.2 are 1:1 except for the current mirrors M17-M18 and M20-M19, whose ratios are both 1:A. When a positive differential signal $v_{id}$ is applied at the inputs of the OTA, I<sub>1</sub> becomes larger than I<sub>2</sub>, where I<sub>1</sub> and I<sub>2</sub> respectively denote the drain currents of M1 and M2. The absolute current difference between I<sub>1</sub> and I<sub>2</sub> is sensed by the current subtraction circuits formed by M16-M22 and is fed back to the tail current of the OTA. Assuming both M1 and M2 work in the weak inversion region, $I_1 + I_2 = A|I_1 - I_2| + I_p$ is obtained by writing the KCL equation at the common source node of the input pair, node X, and $I_1 = I_2 \exp(V_{id}/nV_T)$ follows from the weak inversion characteristics of the input pair. Thus, I<sub>1</sub> and I<sub>2</sub> can be found as (3-4) and (3-5), where $V_T$ is the thermal voltage. The output current of the OTA is the current difference between I<sub>1</sub> and I<sub>2</sub> and is derived as (3-6).



Figure 3.2: An OTA with the adaptive biasing circuit [3]

$$I_{1} = \frac{I_{p} \exp(V_{id}/nV_{T})}{(A+1) - (A-1)\exp(V_{id}/nV_{T})}$$
(3-4)

$$I_{2} = \frac{I_{p}}{(A+1) - (A-1)\exp(V_{id}/nV_{T})}$$
(3-5)

$$I_{out} = I_1 - I_2 = I_p \left[ \frac{-1 + \exp(V_{id}/nV_T)}{(A+1) + (1-A)\exp(V_{id}/nV_T)} \right]$$
(3-6)

For large signal operation,  $v_{id} \gg nV_T$, the output peak current is obtained as (3-7). Equation (3-7) implies that the peak current Ipk becomes very large when A approaches 1. However, the peak current cannot be infinite: when the drain currents of the input pair become large, the input pair leaves the weak inversion region and equations (3-6) and (3-7) are no longer valid. For small signal operation, (3-6) is applicable. The transconductance of the input pair, gm, is defined as  $\partial I_{out} / \partial V_{id}$  and is calculated as (3-8) accordingly. As can be seen from (3-8), gm varies with the differential input signal when A is not equal to zero. This dependency of gm on the differential signal degrades the linearity of the OTA compared with the conventional OTA, where A=0. The reason for the loss in linearity is that the adaptive biasing circuit does not distinguish between small signal and large signal operation: it is always on as long as a differential signal is applied. In an effort to improve the slew rate of an OTA while not degrading the OTA's linearity, the desired features of SRE circuits are discussed in the next section.

$$I_{pk} = I_{out}\big|_{V_{id} \gg nV_{T}} = \begin{cases} \frac{I_{p}}{1 - A}, & 0 \le A < 1 \\ \text{unpredicted current}, & A \ge 1 \end{cases}$$
(3-7)

$$g_{m} \approx \frac{\frac{2I_{p}}{nV_{T}}}{(1 + A)^{2} \left(1 - \frac{V_{id}}{nV_{T}}\right) + (1 - A)^{2} \left(1 + \frac{V_{id}}{nV_{T}}\right) + 2(1 - A^{2})} \approx \frac{I_{p}/nV_{T}}{2 - 2AV_{id}/nV_{T}}$$
(3-8)
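The dependence of gm on the differential input can be checked numerically. The following is a minimal sketch, assuming ideal weak-inversion devices as in (3-6); the normalized values `i_p=1.0` and `n_vt=0.05` are illustrative assumptions, not design values from this work.

```python
import math

def i_out(v_id, a, i_p=1.0, n_vt=0.05):
    """Output current of the adaptive-bias OTA per (3-6); weak inversion assumed."""
    e = math.exp(v_id / n_vt)
    return i_p * (e - 1.0) / ((a + 1.0) + (1.0 - a) * e)

def gm_exact(v_id, a, i_p=1.0, n_vt=0.05):
    """Analytic derivative of (3-6) with respect to v_id."""
    e = math.exp(v_id / n_vt)
    d = (a + 1.0) + (1.0 - a) * e
    return 2.0 * i_p * e / (n_vt * d * d)

def gm_approx(v_id, a, i_p=1.0, n_vt=0.05):
    """First-order approximation, the last form of (3-8)."""
    return (i_p / n_vt) / (2.0 - 2.0 * a * v_id / n_vt)

if __name__ == "__main__":
    # Illustrative sweep: gm normalized to its value at v_id = 0.
    for a in (0.0, 0.5):
        g0 = gm_exact(0.0, a)
        for v in (0.0, 0.005, 0.01):
            print(a, v, gm_exact(v, a) / g0, gm_approx(v, a) / g0)
```

With A=0 the first-order form is constant, while any nonzero A makes gm drift with the input, which is the linearity penalty described above.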

# **3.3.Desired Features of Slew Rate Enhancement Circuits**

In order to avoid linearity degradation, the proposed SRE method should be off for small signal and DC operations. However, when an amplifier is at the onset of slewing, the proposed SRE method should be activated to dynamically increase the SR of the amplifier. The desired features of a proposed SRE method are listed below: a) simplicity; b) low power and area overhead for the SRE circuit; c) a predefined turn-on voltage for the SRE circuit. For small signal operation, the sensed voltage is smaller than the turn-on voltage. Therefore, the SRE circuit stays off in small signal operation and avoids the aforementioned linearity degradation.



## **3.4. Proposed SRE Method via Excessive Transient Feedback**

#### 3.4.1 Concept of the slew rate enhancement via excessive transient feedback

The concept of the proposed SRE method is shown in Figure 3.3. First, a transient signal at the output stage that can be a single ended or differential voltage or current signal, xs, is sensed. Then the feedback signal, xfb, is generated to turn on/off the SRE circuit. xfb is a nonlinear function of xs. The relationship between xfb and xs is given in (3-9), where  $\alpha$  is a non-constant gain factor and xn is the threshold for extracting an excessive transient signal.



Figure 3.3: Concept of the proposed SRE method

$$x_{fb} = f(x_s) = \begin{cases} 0 & \text{if } |x_s| \le x_n \\ \alpha(|x_s| - x_n) & \text{if } |x_s| > x_n \end{cases}$$
(3-9)

When an amplifier is in the dc or small signal operation, that is, when  $|x_s| \le x_n$ , xfb is zero and the SRE feedback is turned off. However, when the amplifier's output stage is at the onset of slewing, xfb, the product of  $\alpha$  and the excessive transient signal,  $|x_s| - x_n$ , will be generated to turn on the SRE feedback. In order to ensure zero impact on the amplifier's small signal operation and an effective SR boost, xn and  $\alpha$  should be set properly.
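The dead-zone behaviour of (3-9) can be sketched in a few lines. This is only an illustration: it treats $\alpha$ as a constant for simplicity, whereas the text allows a non-constant gain factor, and the function name is hypothetical.

```python
def excess_feedback(x_s, x_n, alpha):
    """Return x_fb per (3-9): zero inside the dead zone, alpha*(|x_s| - x_n) outside.

    alpha is taken as a constant here (an assumption made for illustration).
    """
    if x_n < 0:
        raise ValueError("threshold x_n must be non-negative")
    excess = abs(x_s) - x_n
    return alpha * excess if excess > 0 else 0.0

if __name__ == "__main__":
    # Small signal: |x_s| below the threshold, so the SRE feedback stays off.
    print(excess_feedback(0.05, 0.1, 10.0))
    # Onset of slewing: only the part of |x_s| above the threshold is amplified.
    print(excess_feedback(-0.3, 0.1, 10.0))
```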

#### **3.4.2** Selections of sensing and driving nodes for a SRE circuit





Figure 3.4: Different types of SRE methods

The selection of sensing and driving nodes or branches for a SRE circuit is very important. As shown in Figure 3.4, the driving node or branch of a SRE circuit can be a) a tail current source, b) an output node, or c) both a tail current source and an output node. The benefit of boosting the tail current is that the large signal slew rate and the gain bandwidth product of the OTA can be increased simultaneously, which maximizes the OTA's large signal operation speed. However, boosting the tail current to improve slew rate requires that all the circuits in the OTA's large signal path have sufficient dynamic range to respond to a very large tail current without suffering from a long recovery time after the slewing phase. This is usually more difficult to accomplish when the large signal path is long or involves many devices. By injecting the boost current directly into the OTA's output node, the requirement on the OTA's dynamic range can be relaxed because the small and large signal paths of the OTA are separate. The core of the OTA can be optimized for small signal performance, while the SRE circuit can be designed for large signal performance improvement. However, boosting the transient current directly at the OTA's output node increases the SRE circuit's design complexity and requires additional large transistors to conduct the dramatically increased transient current, which may significantly increase the OTA's area. In general, boosting the tail current of an OTA is preferred for design simplicity and compactness if the OTA has a large dynamic range. Otherwise, boosting the transient current at the output node is recommended.

For the selection of circuit nodes or branches for slewing detection, the nodes or branches with the least delay from the OTA's output are generally preferred, since the SRE circuit's turn-on and turn-off delays ultimately limit its effectiveness and robustness. In the example OTA in Figure 3.4, compared to the gates of transistors M3 and M4, the gates of transistors M1 and M2 are faster sensing nodes. Similarly, the gates of transistors M3 and M4 are faster than the gate of M7 in terms of slewing detection. However, sensing the fastest nodes is not always straightforward. In the example OTA, the input nodes are the fastest nodes but have a very wide input common mode range (ICMR). Therefore, a SRE circuit that senses the input nodes needs to accommodate this ICMR, which adds to the design complexity of the circuit. Because of this, one needs to make tradeoffs between the design complexity of the SRE circuit and the delays of the sensing nodes.

# **3.5.Design Example with the Proposed SRE Technique**

Based on the discussed SRE concept via excessive transient feedback, we present an OTA design with a simple SRE circuit as shown in Figure 3.5. The OTA consists of an OTA core and a proposed SRE circuit. The SRE is implemented to boost the OTA's tail current in this design because the OTA has a very wide dynamic range. Transistor M19 is designed to provide transient tail current. M19 is normally off in the quiescent or small signal operation to preserve the OTA's small signal operation and linearity. However, when the OTA is about to slew, M19 will be turned on heavily to provide a large dynamic tail current to effectively boost the OTA's SR and improve its large signal linearity.





Figure 3.5: Designed one-stage OTA with the proposed SRE method

As to slewing detection nodes, the OTA's internal nodes, A and C, are selected because they provide the optimal tradeoff between design complexity and speed of the sensing nodes. The voltages of nodes A and C,  $V_A$  and  $V_C$, share the same common mode voltage,  $V_B$, but have opposite differential voltages. Transistors M14-M16 have the same size, whereas the size ratio of M18 to M17 is n. In this work, n>2 is chosen to make sure that in the quiescent operation M15 and M16 work in the saturation region and M18 works in the triode region. As M18 works in the triode region, its drain source voltage,  $V_{DS18}$, and hence the gate source voltage of M19,  $V_{GS19}$, are very small. As a result, the drain current of M19 is zero in the quiescent operation. Therefore, the OTA's DC operation is untouched by the proposed SRE circuit. In the quiescent operation, the KCL equation at the node of M19's gate is written as (3-10), in which  $\beta_{18}$  and  $\beta_{17}$  are respectively  $\mu_p C_{ox} W_{18}/L_{18}$  and  $\mu_p C_{ox} W_{17}/L_{17}$, and  $V_{od18}$  is M18's overdrive voltage. After solving (3-10),  $V_{DS18}$  is found as (3-11). To ensure that M19 works in the cutoff region in the quiescent operation,  $V_{DS18}$  should be less than the threshold voltage of M19.



$$\beta_{18} \left( V_{od18} - \frac{1}{2} V_{DS18} \right) V_{DS18} = 2 * \frac{1}{2} * \beta_{17} V_{od18}^2$$
(3-10)

$$V_{DS18} = \left(1 - \sqrt{1 - 2/n}\right) V_{od18}$$
(3-11)
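The step from (3-10) to (3-11) can be written out explicitly. With  $n = \beta_{18}/\beta_{17}$  and the normalized variable  $u = V_{DS18}/V_{od18}$  (a shorthand introduced here only for this derivation), dividing (3-10) by  $\beta_{17} V_{od18}^2$  gives:

$$n\left(u - \frac{u^2}{2}\right) = 1 \;\Rightarrow\; u^2 - 2u + \frac{2}{n} = 0 \;\Rightarrow\; u = 1 \pm \sqrt{1 - \frac{2}{n}}$$

Since M18 is in the triode region,  $V_{DS18} < V_{od18}$, so  $u < 1$  and the smaller root is taken, which is (3-11). A real solution with  $u < 1$  exists only for n>2, consistent with the choice of n stated above.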

Upon application of a differential signal,  $v_{id}$, to the input of the OTA, the differential current of the input pair is denoted as  $I_d$. Devices  $R_{1,2}$, M3 and M4 form a local common mode feedback (LCMFB) network, which sets node B as a virtual ground and makes the voltage changes at nodes A and C complementary. In the presence of  $I_d$,  $V_A$  and  $V_C$  become respectively  $V_B + \Delta V$  and  $V_B - \Delta V$, where  $|\Delta V| = 0.5 * |I_d| * R_{1,2}$. When  $|\Delta V| < V_{od14}$, both M15 and M16 stay on. When  $|\Delta V| > V_{od14}$, only one of M15 and M16 is on. According to the square law model, the total drain current of M15 and M16 is found as (3-12). From (3-12), it can be seen that  $I_{15} + I_{16}$  monotonically increases as  $|\Delta V|$  increases. Therefore, the gate voltage of M19, V<sub>ctrl</sub>, monotonically decreases as  $|\Delta V|$  increases. The voltage gain from  $|\Delta V|$  to V<sub>ctrl</sub> is very high when M18 works in the saturation region and M15-M16 work in either the saturation region or the cutoff region. In this operating scenario, any increase of  $|\Delta V|$  dramatically reduces V<sub>ctrl</sub> and hence turns on M19. Therefore, we define the values of  $|\Delta V|$  and I<sub>d</sub> at which M18 enters the saturation region as the turn-on voltage ( $\Delta V_{on}$ ) and the turn-on current (I<sub>d,on</sub>) of the SRE circuit. When the SRE circuit is on, M15-M16 can work in either the saturation region or the cutoff region. According to the definition of  $\Delta V_{on}$, (3-13) is found by writing the KCL equation at transistor M18's drain node, where n is the size ratio of transistor M18 to transistor M17. The form of (3-13) depends on the relationship between  $\Delta V_{on}$  and  $V_{od14}$. If  $\Delta V_{on} < V_{od14}$, transistors M15, M16 and M18 all work in the saturation region at the turn-on boundary of the SRE circuit. If  $\Delta V_{on} > V_{od14}$, at the turn-on boundary of the SRE circuit, M15 and M16 respectively work in the cutoff and saturation regions or vice versa, and M18 works in the saturation region. Therefore,  $\Delta V_{on}$  can be calculated as (3-14). As can be seen,  $\Delta V_{on}$  is equal to  $V_{od14}$  if n=4. If n is smaller than 4, the first equation in (3-14) is valid; otherwise the second is valid. For large signal operation,  $|\Delta V|$  is very large, making M15 or M16 work in the deep triode region and  $V_{ctrl}$  approach Vss. Thus, M19 is turned on heavily and a large transient tail current is provided to the OTA to effectively boost its SR.

$$I_{15} + I_{16} = \begin{cases} \beta_{15} [V_{od14}^2 + \Delta V^2] & |\Delta V| < V_{od14} \\ \frac{1}{2} \beta_{15} [(V_{od14} + \Delta V)^2] & |\Delta V| > V_{od14} \end{cases}$$
(3-12)

$$\begin{cases} \beta_{15} [V_{od14}^{2} + \Delta V_{on}^{2}] = \frac{n}{2} \beta_{15} V_{od14}^{2} & \text{if } \Delta V_{on} < V_{od14} \\ \frac{1}{2} \beta_{15} [(V_{od14} + \Delta V_{on})^{2}] = \frac{n}{2} \beta_{15} V_{od14}^{2} & \text{if } \Delta V_{on} > V_{od14} \end{cases}$$
(3-13)

$$\begin{cases} \Delta V_{on} = \sqrt{\frac{n}{2} - 1} V_{od14} & \text{if } \Delta V_{on} < V_{od14} \\ \Delta V_{on} = (\sqrt{n} - 1) V_{od14} & \text{if } \Delta V_{on} > V_{od14} \end{cases}$$
(3-14)

In short, the proposed SRE method has a predefined turn on voltage,  $\Delta V_{on}$ , with zero impact on an amplifier's DC operating point or small signal performance or linearity. Meanwhile, it can provide a very large dynamic current to effectively enhance an amplifier's SR when the amplifier slews, thus improving the amplifier's large signal linearity.

#### **3.5.1** Small signal analysis

As discussed earlier, the LCMFB, formed by devices  $R_{1,2}$, M3 and M4, sets node B as a virtual ground. As a result, the effects of  $C_{gs3}$  and  $C_{gs4}$  on nodes A and C are eliminated. Compared to an OTA without the LCMFB, the poles associated with nodes A and C, calculated as (3-15), tend to be at a higher frequency, where  $R_A=R_{1,2}//r_{ds3,4}//r_{ds1,2}$  and  $C_A$  is the total parasitic capacitance at node A. Since transistors M3~M6 have the same size, the gain bandwidth product (GBW) of the OTA can be obtained as (3-16), where  $g_{m1}$  and  $g_{m5,6}$  are the transconductances of M1 and M5-M6, and C<sub>L</sub> is the load capacitor. Therefore, the phase margin (PM) of the OTA is approximately found as (3-17). In order to ensure that  $P_A$  imposes little phase degradation on the OTA,  $R_{1,2}$  should be small. In this work, the value of  $R_{1,2}$  is close to  $1/g_{m3,4}$, where  $g_{m3,4}$  is the small-signal transconductance of M3 and M4.

$$P_{A} = \frac{1}{2\pi R_{A}C_{A}}$$
(3-15)

$$GBW = \frac{g_{m1}R_A g_{m5,6}}{2\pi C_L}$$
(3-16)

$$PM \approx 90 - \tan^{-1} \frac{C_A g_{m1} g_{m5,6} R_A^2}{C_L}$$
(3-17)
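Since the argument of the arctangent in (3-17) equals GBW/ $P_A$  from (3-15) and (3-16), the phase margin can be explored with a two-line model. The sketch below is illustrative only: the function names are hypothetical, and the 7.3MHz and 88° inputs are the approximate UGF and PM figures reported in the simulation results, with UGF used in place of GBW as an assumption.

```python
import math

def phase_margin_deg(gbw_hz, p_a_hz):
    """PM per (3-17): 90 degrees minus the phase lag of pole P_A at the GBW frequency."""
    return 90.0 - math.degrees(math.atan(gbw_hz / p_a_hz))

def pole_for_phase_margin(gbw_hz, pm_deg):
    """Invert (3-17): non-dominant pole frequency needed for a target PM (pm_deg < 90)."""
    return gbw_hz / math.tan(math.radians(90.0 - pm_deg))

if __name__ == "__main__":
    gbw = 7.3e6      # Hz, approximate UGF from the simulation section (used as GBW here)
    target_pm = 88.0  # degrees
    p_a = pole_for_phase_margin(gbw, target_pm)
    print(p_a, phase_margin_deg(gbw, p_a))
```

The inverse form makes the design constraint visible: a smaller  $R_{1,2}$  raises  $P_A$  and therefore the phase margin.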

#### **3.5.2** Large signal analysis

In the presence of a differential current in the input pair,  $I_d$, the current flowing in  $R_{1,2}$  is  $I_d/2$, while  $V_A$  and  $V_C$  correspondingly become  $V_B+R_1I_d/2$  and  $V_B-R_1I_d/2$. The currents in M5 and M6 in the output stage are obtained as (3-18). Since M7 and M8 have the same size, the current flowing in  $C_L$  is equal to the current difference between  $I_5$  and  $I_6$  as given by (3-19), in which  $V_{od3}$  is proportional to the square root of  $I_1+I_2$. This means that a boosted transient tail current always enhances the OTA's SR, no matter whether the transient current in M1 and M2 is differential-mode or common-mode current. Also, a large  $R_{1,2}$  in the LCMFB is helpful for SRE but reduces the phase margin and stability of the OTA, as shown in (3-17). Fortunately, the proposed SRE method does not require large resistors to achieve a large slew rate enhancement, and hence the method satisfies the stability requirement.

$$I_{5} = \frac{1}{2}\beta_{3} \left( V_{od3} + \frac{R_{1,2}I_{d}}{2} \right)^{2}, I_{6} = \frac{1}{2}\beta_{3} \left( V_{od3} - \frac{R_{1,2}I_{d}}{2} \right)^{2}$$
(3-18)



$$I_{L} = \begin{cases} \beta_{3} V_{od3} R_{1,2} I_{d} & \text{if } I_{d} < 2V_{od3}/R_{1,2} \\ \frac{1}{2} \beta_{3} \left( V_{od3} + \frac{R_{1,2} |I_{d}|}{2} \right)^{2} & \text{if } I_{d} > 2V_{od3}/R_{1,2} \end{cases}$$
(3-19)
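The two regions of (3-19) join continuously at  $|I_d| = 2V_{od3}/R_{1,2}$, which a short numerical sketch makes easy to confirm. The code assumes ideal square-law devices as in (3-18); the sign convention for negative  $I_d$  and the component values are assumptions chosen for illustration.

```python
def load_current(i_d, beta3, v_od3, r12):
    """Current delivered to C_L per (3-18)/(3-19); square-law devices assumed.

    The result is signed so that it follows the polarity of i_d (an assumption;
    (3-19) gives the magnitude in the large-signal region).
    """
    dv = r12 * abs(i_d) / 2.0  # magnitude of the voltage shift at nodes A and C
    if dv < v_od3:
        # Both M5 and M6 conduct: I5 - I6 = beta3 * v_od3 * r12 * i_d
        return beta3 * v_od3 * r12 * i_d
    # One output device is cut off; the other carries the full square-law current.
    sign = 1.0 if i_d >= 0 else -1.0
    return sign * 0.5 * beta3 * (v_od3 + dv) ** 2

if __name__ == "__main__":
    beta3, v_od3, r12 = 1e-3, 0.2, 2e3  # hypothetical values (A/V^2, V, ohm)
    for i_d in (50e-6, 200e-6, 800e-6):
        print(i_d, load_current(i_d, beta3, v_od3, r12))
```

Beyond the boundary the load current grows quadratically with  $I_d$, which is why a boosted tail current translates into a large slew rate gain.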

$$I_{d,on} = \frac{2 * \Delta V_{on}}{R_{1,2}} = \frac{2V_{od14}}{4R_{1,2}} = \frac{\beta V_{od3} g_{m3,4}}{2\alpha} = \frac{\beta I_{tail,Q}}{2\alpha}$$
(3-20)

In this design, the size ratio of M18 to M17 is n=2.125. To guarantee this ratio after fabrication, careful layout techniques or simple trimming circuits may need to be implemented. After plugging n into (3-14), the turn-on voltage,  $\Delta V_{on}$, of the SRE circuit is found as 0.25V<sub>od14</sub>. Assuming that R<sub>1,2</sub> is  $\alpha/g_{m3,4}$  and V<sub>od14</sub> is equal to  $\beta V_{od3}$, the resulting turn-on differential current I<sub>d,on</sub> is given by (3-20), where I<sub>tail,Q</sub> is the quiescent drain current in M0. Equation (3-20) implies that as long as  $\beta/2\alpha$  is less than 1, the SRE circuit will be turned on when the OTA starts to slew. In this design,  $\beta=1$  and  $\alpha=1$. After the OTA completes slewing and enters small signal settling, I<sub>d</sub> becomes less than I<sub>d,on</sub>, which turns off the SRE circuit.
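The numbers in this paragraph can be reproduced from (3-11), (3-14) and (3-20). The sketch below is a plain evaluation of those formulas under the stated assumptions (R<sub>1,2</sub> =  $\alpha/g_{m3,4}$, V<sub>od14</sub> =  $\beta V_{od3}$ ); the function names are hypothetical.

```python
import math

def vds18_over_vod18(n):
    """Quiescent V_DS18 / V_od18 from (3-11); requires n > 2 so that M18 is in triode."""
    if n <= 2:
        raise ValueError("(3-11) requires n > 2")
    return 1.0 - math.sqrt(1.0 - 2.0 / n)

def turn_on_voltage_ratio(n):
    """Delta V_on / V_od14 from (3-14); the two branches meet at n = 4."""
    if n <= 2:
        raise ValueError("(3-14) requires n > 2")
    if n < 4:
        return math.sqrt(n / 2.0 - 1.0)
    return math.sqrt(n) - 1.0

def turn_on_current_ratio(n, alpha, beta):
    """I_d,on / I_tail,Q: (3-20) written for a general n, using g_m3,4 * V_od3 = I_tail,Q."""
    return 2.0 * turn_on_voltage_ratio(n) * beta / alpha

if __name__ == "__main__":
    n = 2.125
    print(vds18_over_vod18(n))            # quiescent V_DS18 as a fraction of V_od18
    print(turn_on_voltage_ratio(n))       # turn-on voltage as a fraction of V_od14
    print(turn_on_current_ratio(n, 1.0, 1.0))  # turn-on current as a fraction of I_tail,Q
```

For n=2.125 and  $\alpha=\beta=1$  the last ratio evaluates to  $\beta/2\alpha$ , in agreement with (3-20).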

# **3.6. Simulation Results**

To show the effectiveness of the proposed SRE method, three one-stage single-ended OTAs are designed in the IBM 130nm process. The first OTA (conventional) and the second (proposed) share the same core amplifier as shown in Figure 3.5; but the conventional OTA does not have any SRE circuit whereas the proposed OTA has the proposed SRE circuit. The third OTA (adaptive) has an adaptive SRE circuit [3]. The three designed OTAs have almost the same unity gain frequency (UGF) of 7.3MHz and phase margin (PM) of 88° under the same capacitive load of 20pF. Small signal step responses of the three OTAs are shown in Figure 3.6. Unlike the adaptive method, the OTA with the proposed SRE circuit preserves the small signal step responses of the conventional OTA.





Figure 3.6: Small signal transient response of the three designed OTAs



Figure 3.7: Step responses of the three OTAs (a) output voltages (b) tail currents

Upon application of a 0.8V voltage step to the input of each OTA in the unity gain buffer configuration, the transient responses of the three OTAs are shown in Figure 3.7(a). As shown in Figure 3.7(a), the proposed SRE method raises the average slew rate to about 2320% of that of the conventional OTA (roughly 23 times) with power and area overhead of only 2% and 1.2%. Compared with the adaptive method [3], the proposed SRE method raises the average slew rate to more than 300% of that of the adaptive OTA while reducing power and area by 11.1% and 25%. In the slewing phases, the corresponding transient tail currents of the three OTAs are displayed in Figure 3.7(b). The peak transient tail currents of the proposed OTA are 1158µA in the negative slewing phase and 871.6µA in the positive slewing phase, which are respectively about 4 and 2.4 times those of the adaptive OTA, and 14.5 and 13.4 times those of the conventional OTA. In addition, the linearity of the three OTAs is simulated with a 1MHz, 0.6V peak-to-peak sine wave. The total harmonic distortion (THD) of the proposed OTA is improved by 18dB and 6dB compared with the adaptive and conventional OTAs, respectively. The performance of the three designed OTAs is summarized and compared in Table 3.1.

| Parameter                                 | Conventional    | Adaptive [3] | Proposed  |
|-------------------------------------------|-----------------|--------------|-----------|
| Load Capacitor (pF)                       | 20              | 20           | 20        |
| DC Gain (dB)                              | 24.9            | 24.86        | 24.9      |
| UGF (MHz)                                 | 7.33            | 7.58         | 7.33      |
| PM (deg)                                  | 88.7            | 89           | 88.7      |
| SR+/SR- (V/µs)                            | 8/5.6           | 61.8/34.8    | 138/178.4 |
| THD (dBc) @ $V_{pp}=0.6V$, $f_{in}=1MHz$  | -56.7           | -44.7        | -62.7     |
| Estimated Area (µm<sup>2</sup>)           | 8,214           | 11,065       | 8,310     |
| Current consumption (µA)                  | 252.9           | 290.2        | 258       |
| Supply Voltage (V)                        | 1.5             | 1.5          | 1.5       |
| Technology                                | IBM 0.13µm CMOS |              |           |

Table 3.1: Performance summary of the three designed OTAs
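The headline comparisons can be recomputed directly from Table 3.1. The following sketch transcribes the table values and assumes that the "average slew rate" is the arithmetic mean of SR+ and SR-; because all three OTAs share the same 1.5V supply, the change in supply current is used as the change in power.

```python
# Values transcribed from Table 3.1.
TABLE = {
    "conventional": {"sr_pos": 8.0, "sr_neg": 5.6, "thd_dbc": -56.7,
                     "area_um2": 8214.0, "current_ua": 252.9},
    "adaptive": {"sr_pos": 61.8, "sr_neg": 34.8, "thd_dbc": -44.7,
                 "area_um2": 11065.0, "current_ua": 290.2},
    "proposed": {"sr_pos": 138.0, "sr_neg": 178.4, "thd_dbc": -62.7,
                 "area_um2": 8310.0, "current_ua": 258.0},
}

def average_sr(row):
    """Arithmetic mean of SR+ and SR- (assumed definition of average slew rate)."""
    return (row["sr_pos"] + row["sr_neg"]) / 2.0

def compare(new, ref):
    """Figures of merit of `new` relative to `ref`."""
    return {
        "sr_ratio": average_sr(new) / average_sr(ref),
        "thd_gain_db": ref["thd_dbc"] - new["thd_dbc"],
        "area_change_pct": 100.0 * (new["area_um2"] / ref["area_um2"] - 1.0),
        "current_change_pct": 100.0 * (new["current_ua"] / ref["current_ua"] - 1.0),
    }

if __name__ == "__main__":
    for ref in ("conventional", "adaptive"):
        print(ref, compare(TABLE["proposed"], TABLE[ref]))
```

Under these assumptions the slew rate ratios come out at roughly 23 and 3.3, close to the percentages quoted in the text.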

#### 3.7.Summary

A simple yet very effective SRE method has been introduced. Compared with the conventional OTA, the proposed OTA preserves small signal performance, raises the average SR to about 2320% of the conventional value and improves THD by 6dB, with power and area overhead of only 2% and 1.2%. Compared with the adaptive OTA, the SR of the proposed OTA is more than 300% of the adaptive value and the THD is improved by 18dB. Owing to its low power consumption, small area overhead, design simplicity and high effectiveness, the proposed SRE method is suitable for applications that need to drive large capacitive loads with low static power dissipation.



# **3.8.References**

- [1]. R. Castello, and P.R Gray, "A high-performance micropower switched-capacitor filter", *IEEE J. Solid-State Circuits*, vol. 20, no. 6, pp. 1122-1132, Dec. 1985
- [2]. B.W. Lee, and B.J Sheu, "A high slew-rate CMOS amplifier for analog signal processing", *IEEE J. Solid-State Circuits*, vol. 25, no. 3, pp. 885-889, June 1990
- [3]. M. Degrauwe, J. Rijmenants, E. A. Vittoz, and H. J. De Man, "Adaptive biasing CMOS amplifiers", *IEEE J. Solid-State Circuits*, vol. 17, no. 3, pp. 522-528, June 1982
- [4]. R. Harjani, R. Heineke, and F. Wang, "An integrated low voltage class AB CMOS OTA", *IEEE J. Solid-State Circuits*, vol.34, no. 2, pp. 134-142, Feb 1999
- [5]. A.J. Lopez-Martin, S. Baswa, J. Ramirez-Angulo, and R.G. Carvajal, "Low-Voltage Super class AB CMOS OTA cells with very high slew rate and power efficiency", *IEEE J. Solid-State Circuits*, vol.40, no. 5, pp. 1068-1077, May 2005
- [6]. R. Klinke, B.J. Hosticka, and H. Pfleiderer, "A very-high-slew-rate CMOS operational amplifier", *IEEE J. Solid-State Circuits*, vol.24, no. 3, pp. 744-746, Jun 1989



# CHAPTER 4. POWER EFFICIENCY ENHANCEMENT FOR OP AMPS DRIVING LARGE CAPACITIVE LOADS

## **4.1. Introduction**

In modern high-resolution thin-film-transistor liquid-crystal displays (TFT-LCDs), gamma correction must be performed to correct nonlinearities in the glass transmission characteristics of the LCD panel [1]. The typical LCD source driver for 64 levels of grayscale uses internal digital-to-analog converters (DACs) to convert the 6-bit data into analog voltages. These analog voltages are buffered by gamma buffers that drive large capacitive loads in the range of 10nF to 100nF, which provide the glitch energy during DAC conversions [2]. For these gamma buffers, the output voltage swing should be large and the DC gain should be more than 66dB for 10-bit resolution [3]. Other very important circuit parameters for these op amps are gain-bandwidth product (GBW), slew rate (SR), power consumption, and circuit area.

# **4.2.Literature Review**

## 4.2.1 General review

Multistage op amps are the predominant approach for gamma buffers in LCD applications because of their superior gain/speed-to-power ratios [4]-[8]. However, all these multistage amplifiers need complicated frequency compensation, which significantly increases design complexity. Recently, single-stage amplifiers used as gamma buffers have become popular in LCD applications. The single-stage amplifier designs in [3] and [9] target these applications and have reported favorable GBW and SR performance over their multistage counterparts [4]-[8]. The methods in [3] and [9] are reviewed in the next section.



## 4.2.2 State-of-the-art methods

## 4.2.2.1 Nested Current Mirror Approach [3]

Figure 4.1 shows a basic cell working as the preamplifier (preamp) of the nested current mirror (NCM) amplifier [3]. In [3], multiple preamps are cascaded to improve the amplifier's GBW. Each preamp consists of a PMOS and an NMOS input pair. The PMOS input pair is always tied to the input signals (V1i and V2i) of the whole NCM amplifier, whereas the NMOS input pair is tied to the outputs from the prior preamp stage (V4i and V6i). The outputs of the current preamp stage are denoted as V3i and V5i. As the poles associated with the preamp stages are at much higher frequencies than the entire amplifier's GBW, cascading multiple preamp stages enhances the entire amplifier's GBW and gain.



Figure 4.1: Basic cell used in the nested current mirror based single stage op amp

$$\Delta v_3 = (\Delta v_{th2} - \Delta v_{th1} + \Delta v_2 - \Delta v_1)(k + 1) - \Delta v_{th3} - k(\Delta v_{th4})$$
(4-1)

$$\Delta v_5 = -(\Delta v_{th2} - \Delta v_{th1} + \Delta v_2 - \Delta v_1)(k + 1) - \Delta v_{th5} - k(\Delta v_4 + \Delta v_{th6})$$
(4-2)

However, when cascading multiple preamp stages, the random offset voltages from all the preceding preamp stages are also amplified. This can be illustrated by looking at how the offset errors at the gates of the transistors M1 and M4 are amplified to the preamp's outputs. Using small-signal analysis at DC, the random offset voltages at the gates of M3 and M5 can be derived as (4-1) and (4-2), where  $\Delta v_i$  and  $\Delta v_{thi}$  are respectively the total voltage error and threshold voltage error of transistor Mi, with i=1, 2, ..., 6, and k is the size ratio of M4 to M3. Equations (4-1) and (4-2) clearly show that any voltage errors at the NMOS input pair (M4 and M6), including their threshold voltage errors and the voltage errors from preceding preamp stages, are amplified by k times to the output (the gates of M3 and M5). Similarly, the voltage errors at the PMOS input pair are amplified by k+1 times. Because of this voltage error amplification, the quiescent currents in transistors M3 and M5 can deviate far from their nominal values. The voltage errors can be divided into differential-mode and common-mode voltage errors. The differential-mode errors at the inputs of the NMOS pair can be partially corrected as the offset voltage of the NCM amplifier in a closed loop configuration, whereas the common-mode voltage errors directly affect the quiescent currents of the preamp's output stages and the succeeding circuits. Due to the uncontrolled common-mode errors in [3], the quiescent currents of M3 and M5 can even become zero when more than three preamp stages are cascaded. The absence of well-defined quiescent currents in the NCM amplifier severely limits its robustness, yield and thus practical applicability.

### 4.2.2.2 Signal-Current Enhancer Approach [9]

The basic preamp circuit of another state-of-the-art op amp design [9] for driving large capacitive loads is shown in Figure 4.2. The preamp provides gain from its differential current input to its differential current output. The input current consists of a common-mode DC bias current,  $I_B$, and a differential-mode signal current,  $i_s$. Ideally, the differential signal current gain from the input to the output is (2K+1), with the aspect ratios of transistors MP1~MP3 being 1:K+1:K. By cascading N preamp stages, ideally the current gain is  $(2K+1)^N$  and the circuit's GBW is improved by  $(2K+1)^N$. However, this approach suffers from severe tradeoffs among quiescent supply current constancy, power supply rejection, small-signal performance and large-signal performance. The tradeoffs are discussed below.



Figure 4.2: Basic cell used in [9]

Due to channel length modulation effects, the currents flowing out of nodes n3 and n4 are calculated as (4-3) and (4-4), in which  $\lambda_{\rm P}$  and  $\lambda_{\rm N}$  are the channel length modulation coefficients of the PMOS and NMOS transistors. For simplicity, we assume  $\lambda_{\rm P} \approx \lambda_{\rm N}$  and that the gate source voltages of transistors of the same type are the same, as expressed in (4-5). Therefore, the common-mode and differential-mode currents of In3 and In4 are expressed as (4-6) and (4-7). As can be seen in (4-6), the current errors due to channel length modulation are amplified by (K+1) times for a single preamp stage. The amplifier in [9] has five preamp stages in cascade, and thus the errors in the common-mode current are amplified by a factor of (K+1)<sup>5</sup>. The value of (K+1)<sup>5</sup> is as high as 1024. After plugging in  $\lambda_{\rm N}$ =0.15, V<sub>DD</sub>=1.8V, V<sub>GS,MP4</sub>≈0.55V and V<sub>GS,MN1</sub>≈0.45V for this 180nm CMOS process, the common-mode current error is found to be as high as 123\*I<sub>B</sub> after cascading five preamp stages, which is significantly higher than the desired bias current, I<sub>B</sub>. The actual current error should be slightly smaller than the calculated 123\*I<sub>B</sub> because V<sub>GS,MP4</sub> and V<sub>GS,MN1</sub> also slightly increase when the bias current increases. Nevertheless, there is still a huge amplification factor for the current error. Consequently, the op amp in [9] is extremely sensitive to its supply voltage.

$$I_{n3} = (K+1)(I_{B} - i_{s}) \frac{1 + \lambda_{N}(V_{DD} - V_{GS,MP4})}{(1 + \lambda_{N}V_{GS,MN1})} - K(I_{B} + i_{s}) \frac{1 + \lambda_{P}V_{GS,MP4}}{1 + \lambda_{P}V_{GS,MP1}}$$

$$\approx (K+1)(I_{B} - i_{s})[1 + \lambda_{N}(V_{DD} - V_{GS,MP4} - V_{GS,MN1})] - K(I_{B} + i_{s})$$
(4-3)

$$I_{n4} = (K+1)(I_{B} + i_{s}) \frac{1 + \lambda_{P}(V_{DD} - V_{GS,MN4})}{1 + \lambda_{P}V_{GS,MP1}} - K(I_{B} - i_{s}) \frac{1 + \lambda_{N}V_{GS,MN4}}{1 + \lambda_{N}V_{GS,MN1}}$$

$$\approx (K+1)(I_{B} + i_{s})[1 + \lambda_{P}(V_{DD} - V_{GS,MP4} - V_{GS,MN1})] - K(I_{B} - i_{s})$$
(4-4)

$$V_{GS,MN4} \approx V_{GS,MN1}, V_{GS,MP4} \approx V_{GS,MP1}, \lambda_P \approx \lambda_N \propto \frac{1}{L}$$
(4-5)

$$I_{n,cm} = \frac{I_{n3} + I_{n4}}{2} = I_{B} + (K+1)I_{B} * \lambda_{N}(V_{DD} - V_{GS,MP4} - V_{GS,MN1})$$
(4-6)

$$I_{n,dm} = \frac{I_{n4} - I_{n3}}{2} = (2K + 1)i_s + (K + 1)i_s * \lambda_N (V_{DD} - V_{GS,MP4} - V_{GS,MN1})$$
(4-7)
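The 123\*I<sub>B</sub> estimate quoted above can be reproduced from the error term of (4-6). The sketch below uses the values given in the text; K=3 is not stated explicitly and is inferred here from (K+1)<sup>5</sup>=1024, and the function names are hypothetical.

```python
def cm_error_per_stage(k, lam, v_dd, v_gs_p, v_gs_n):
    """Common-mode current error of one stage divided by I_B: second term of (4-6)."""
    return (k + 1) * lam * (v_dd - v_gs_p - v_gs_n)

def cm_error_cascade(k, lam, v_dd, v_gs_p, v_gs_n, stages):
    """Error after cascading, using the (K+1)**stages amplification estimate of the text."""
    return (k + 1) ** stages * lam * (v_dd - v_gs_p - v_gs_n)

if __name__ == "__main__":
    k = 3  # inferred from (K+1)^5 = 1024, not stated in the text
    args = dict(lam=0.15, v_dd=1.8, v_gs_p=0.55, v_gs_n=0.45)
    print(cm_error_per_stage(k, **args))           # error of a single stage, in units of I_B
    print(cm_error_cascade(k, stages=5, **args))   # error after five stages, in units of I_B
```

Because the error scales linearly with V<sub>DD</sub> minus the two gate source voltages, small supply variations are multiplied by the same large factor.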

In order to mitigate the quiescent current variation of [9] caused by its high sensitivity to supply voltage, either a fixed supply voltage equal to  $V_{GS,MN1} + V_{GS,MP1}$  or transistors with long channel lengths are needed. In [9], the op amp is designed in a 130nm CMOS process and is powered by a 0.7V supply voltage. However, this leads to the demand for a sophisticated LDO design to provide a constant 0.7V voltage. This not only significantly increases the design complexity and area consumption but also degrades the maximum achievable slew rate (SR) of the op amp, because the maximum SR is approximately proportional to the square of the supply voltage. On the other hand, increasing the transistors' channel lengths would severely compromise the op amp's speed. The pole frequencies associated with the gates of MP1 and MN1 in Figure 4.2 are found as  $f_{TP}/(2K+1)$  and  $f_{TN}/(2K+1)$  respectively, where  $f_{TP}$  and  $f_{TN}$  are the unity current gain frequencies of MP1 and MN1. A transistor's unity current gain frequency decreases as its channel length increases, with the relationship shown in (4-8). As the transistor length changes,  $f_T$  changes faster than the channel length modulation coefficient,  $\lambda$, which is proportional to 1/L. Therefore, if  $\lambda$  is reduced by 10 times by increasing the transistor's length by 10 times,  $f_T$  drops by about 31.6 times. Consequently, this severely degrades the preamp's speed.

$$f_{\rm T} = \frac{g_{\rm m}}{C_{\rm gs}} = \sqrt{\frac{2\mu I_{\rm d}}{W L^3 C_{\rm ox}}}$$
(4-8)
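The 31.6 times figure follows directly from the  $L^{-3/2}$  dependence in (4-8). Holding  $\mu$,  $I_d$, W and  $C_{ox}$  fixed while the length is scaled from L to 10L:

$$\frac{f_T(10L)}{f_T(L)} = \sqrt{\frac{L^3}{(10L)^3}} = 10^{-3/2} \approx \frac{1}{31.6}, \qquad \frac{\lambda(10L)}{\lambda(L)} = \frac{1}{10}$$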

In addition, the small-signal performance of [9], such as GBW, compromises its large-signal performance, such as slew rate. In the slewing phases, the large transient current to the output is the amplified differential current of the input pair. Therefore, all the transistors in the op amp in [9] need to carry large transient currents so that the op amp's output transient current is sufficient to charge or discharge the load capacitor. There are two main disadvantages of passing large currents through multiple stages. First, in order to pass a large transient current to the output stage, the W/L ratios of all the transistors should be large. For a given bias current and transistor length, a larger transistor width results in a smaller  $f_T$, as shown by (4-8). This lowers the frequencies of the non-dominant poles in the preamp, which ultimately limits the GBW of the entire op amp. Second, the transient current efficiency of the op amp is low. Ideally, in the slewing phases, we want all the generated large transient current to flow only into the load capacitor so that minimal transient current is wasted in the intermediate stages of the op amp. However, all the preamp stages in [9] waste a considerable portion of the large transient current that is passed to the load capacitor.

Last but not least, some bias currents of the preamp stage shown in Figure 4.2 are wasted, such as the drain currents of load transistors MN1, MN4, MP1 and MP4. Ideally, we want zero current wasted in any of the load devices so as to maximize the transconductance and GBW of the preamp for a given supply current.


# 4.3. Desired Features of Op Amp for Driving Large Capacitive Loads

In an effort to solve the problems of [3] and [9], a desired op amp design for driving large capacitive loads should meet the following requirements:

- a) Possesses a well-defined quiescent current for each branch of circuits
- b) Decouples small-signal and large-signal operations
- c) Has robust performance under random mismatch variations
- d) Eliminates current wasted in the preamp's load circuits

# 4.4. Concept of the Proposed Power-Efficient Op Amp Design for Driving Large Capacitive Loads



Figure 4.3: Proposed power-efficient op amp design for driving large capacitive loads

$$x_{fb} = f(x_s) = \begin{cases} 0 & \text{if } |x_s| \le x_n \\ \alpha(|x_s| - x_n) & \text{if } |x_s| > x_n \end{cases}$$
(4-9)

The conceptual power-efficient op amp design for driving large capacitive loads is shown in Figure 4.3. Unlike [3][9], this op amp design decouples its small- and large-signal paths. The small-signal enhancement path, as shown by the blue arrow, consists of two voltage-to-current converters (V-I), one current-to-voltage converter (I-V) and multiple voltage-to-voltage converters (V-V). All the converters except the output stage class AB V-I converter work as the preamp stages of the op amp, and the preamp stages do not need to carry a large transient current in the slewing phases. As a result, unlike [3][9], the need for large transistor sizes in the preamp stages due to large-signal operation is eliminated. Therefore, all the preamp stages in this work can be designed mainly for small-signal performance improvement. In addition, the quiescent current of all the circuits in the op amp is well defined. Furthermore, the V-V stages, which generate the largest amount of gain and small-signal improvement, waste zero current in their load circuits. This increases the power efficiency of the preamp and the entire op amp compared with [3][9].

As for the large-signal performance enhancement path, shown by the red arrow, it senses internal nodes of the input V-I and detects whether the op amp is in the slewing phase. The large-signal enhancement circuit implements a nonlinear function of the sensed signal. The nonlinear function is similar to the function of the slew rate enhancement (SRE) circuit introduced in Chapter 3 and is repeated as (4-9), where $x_s$ and $x_n$ are respectively the sensed signal and the threshold voltage or current for the sensed signal to activate the SRE circuit. In addition, $x_{fb}$ is the control signal that activates the SRE circuit when $x_{fb}>0$ and deactivates it when $x_{fb}=0$. When the op amp is in dc or small-signal operation, that is, when $|x_s| \le x_n$, $x_{fb}$ becomes zero, deactivating the SRE circuit. When the op amp's input stage is at the onset of slewing, $x_{fb}$, the product of $\alpha$ and the excess transient signal, $|x_s| - x_n$, is generated to turn on the SRE circuit so as to increase the tail current of the last preamp stage. Due to the presence of both large input signals and increased tail current at the last preamp stage, the preamp stage generates large differential output voltages. As a result, the output class AB V-I driver delivers a large transient current to the load capacitor to boost the op amp's slew rate.
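The dead-zone behavior of (4-9) can be illustrated with a minimal Python sketch. The function name and the numerical values in the usage lines are hypothetical and serve only to show the two operating regions; they are not taken from the design.

```python
def sre_control(x_s: float, x_n: float, alpha: float) -> float:
    """Control signal x_fb of (4-9).

    Returns 0 while |x_s| <= x_n (dc or small-signal operation) and
    alpha * (|x_s| - x_n) once the sensed signal exceeds the threshold.
    """
    excess = abs(x_s) - x_n
    return alpha * excess if excess > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical threshold and gain, for illustration only.
    for x_s in (0.05, -0.08, 0.3, -0.3):
        print(x_s, sre_control(x_s, x_n=0.1, alpha=4.0))
```

Because the function depends only on $|x_s|$, positive and negative slewing events of the same magnitude produce the same control signal.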



In summary, the benefits of the proposed power-efficient op amp design are as follows.

- 1) The small-signal and large-signal paths for performance enhancement are decoupled. This eliminates the aforementioned trade-offs between GBW and the capability to convey large transient current. In addition, this improves the transient current efficiency of op amps during the slewing phase, since only the output stage conducts large current to charge/discharge the load capacitor.
- 2) All the circuits used have well-defined quiescent current. This eliminates the aforementioned trade-offs between GBW and quiescent current variations of op amps.
- 3) Zero bias current is wasted in the V-V preamp design. This increases the power efficiency of the V-V preamp stage and the entire op amp.

### **4.5. Design Example**

In this section, we demonstrate a power-efficient op amp design driving a large capacitive load, i.e., 15nF, with the proposed preamp stage. The power-efficient design strategy for the op amp is also discussed.

#### **4.5.1** Design of the V-V preamp stage

The schematic of the V-V preamp stage is shown in Figure 4.4. The inputs and outputs of the preamp are V1+/V1- and V2+/V2- respectively. The preamp has a well-defined quiescent current because the transistor M6 has a fixed bias current. All the bias currents are used to generate transconductance of both NMOS and PMOS transistors, i.e. M1 and M3. Zero current is wasted in the preamp's load circuits, which are the two resistors, R. These two resistors also form a local common-mode feedback loop to define the preamp's output common-mode voltage.



Figure 4.4: Schematic of the designed V-V preamp stage

## 4.5.1.1 Large-signal Analysis of the Preamp Stage



Figure 4.5: a) Positive slewing phase of the last preamp stage b) op amp output stage

$$V_{2,dm} = \frac{V_{2+} - V_{2-}}{2} = I_{tail} * R$$
(4-10)



Figure 4.5 shows the designed op amp's class AB output stage and its preceding preamp stage. As shown in Figure 4.5(a), transistors M3-M5 and the two resistors R form a local common-mode feedback loop in the quiescent or small-signal operation. In the quiescent operation, M5 is biased in the saturation region to define the preamp's common-mode output voltage,  $V_{2,cm}$ , which then defines the quiescent current of the op amp's class AB output stage shown in Figure 4.5(b). When the preamp's input differential voltage,  $V_{1,dm} = 0.5*(V_{1+} - V_{1-})$ , is larger than  $0.7*V_{od1}$ , the preamp is in a positive slewing phase. In this phase, the preamp's differential-mode output voltage,  $V_{2,dm}$ , is calculated as (4-10), and its  $V_{2,cm}$  depends on both  $V_{2,dm}$  and the operation regions of M3 and M5.  $V_{2,dm}$  and  $V_{2,cm}$  are analyzed as follows.

a) In the positive slewing phase, V<sub>1+</sub> and V<sub>1-</sub> respectively increase and decrease by at least $(\sqrt{2} - 1)*V_{odi}$ from their quiescent voltages, where V<sub>odi</sub> is the quiescent overdrive voltage of transistor M<sub>i</sub>, i=1,2,...,4. The values of the transistors' V<sub>odi</sub> are close to each other. If V<sub>2,dm</sub> is so small that M3 and M5 work in the saturation region in the slewing phase, V<sub>2,cm</sub> can be found as (4-11), in which I<sub>tail</sub> is the drain current of M6 and V<sub>th5</sub> is M5's threshold voltage. In addition, $\mu$, C<sub>ox</sub>, W<sub>i</sub> and L<sub>i</sub> are respectively transistor M<sub>i</sub>'s mobility, gate oxide capacitance, width and length. As M3 and M5 work in the saturation region, V<sub>2-</sub> is larger than $\sqrt{2}V_{od3} + V_{od5}$ as expressed by (4-12). After solving (4-12), the expression I<sub>tail</sub> \* R < V<sub>th5</sub> - $\sqrt{2}V_{od3}$ is found. Then based on (4-13), V<sub>2+</sub> is found to be smaller than V<sub>od5</sub> + 2V<sub>th5</sub> - $\sqrt{2}V_{od3}$. Therefore, in the positive slewing phase, V<sub>2+</sub> is smaller than the supply voltage by at least 4.4\*V<sub>od</sub>, because the supply voltage is higher than V<sub>od5</sub> + V<sub>od6</sub> + V<sub>od1</sub> + V<sub>od3</sub> + V<sub>th1</sub> + V<sub>th3</sub>, which is approximately 4\*V<sub>od</sub>+2\*V<sub>th</sub>. This shows that V<sub>2+</sub> is not maximized if M3 and M5 still work in the saturation region in the slewing phase.



$$V_{2,cm} = \frac{V_{2+} + V_{2-}}{2} = \sqrt{\frac{2 * \text{Itail} * L_5}{\mu C_{\text{ox}} W_5}} + V_{\text{th5}} = V_{\text{od5}} + V_{\text{th5}}$$
(4-11)

$$V_{2-} = V_{2,cm} - V_{2,dm} = V_{od5} + V_{th5} - I_{tail} * R > \sqrt{2}V_{od3} + V_{od5}$$
(4-12)

$$V_{2+} = V_{2,cm} + V_{2,dm} = V_{od5} + V_{th5} + I_{tail} * R < V_{od5} + 2V_{th5} - \sqrt{2}V_{od3}$$
(4-13)

b) In the positive slewing phase, if V<sub>2,dm</sub> is large enough to make M3 and M5 work in the triode and saturation regions respectively, the common-mode output voltage is still calculated as (4-11) because M5 still works in the saturation region. The drain-source voltage of M5, V<sub>ds5</sub>, is given by (4-14), where R<sub>on3</sub> is the on resistance of transistor M3 working in the triode region. In order to keep M5 working in the saturation region, V<sub>ds5</sub> needs to be larger than V<sub>od5</sub>. Thus, it is found that V<sub>th5</sub>>I<sub>tail</sub>\*(R+R<sub>on3</sub>). The expression of V<sub>2-</sub>, given as (4-15), is derived from the fact that the total drain-source voltage of M3 and M5 is smaller than the sum of their overdrive voltages in the slewing phase because M3 works in the triode region. Equation (4-16) for V<sub>2+</sub> can be easily derived by substituting V<sub>th5</sub>>I<sub>tail</sub>\*(R+R<sub>on3</sub>). V<sub>2+</sub> is smaller than the supply voltage, around 4\*V<sub>od</sub>+2\*V<sub>th</sub>, by at least 3\*V<sub>od</sub>. Therefore, in this scenario, V<sub>2+</sub> is still not maximized in the positive slewing phase.

$$V_{ds5} = V_{od5} + V_{th5} - I_{tail} * (R + R_{on3}) > V_{od5}$$
(4-14)

$$V_{2-} = V_{2,cm} - V_{2,dm} = V_{od5} + V_{th5} - I_{tail} * R < \sqrt{2}V_{od3} + V_{od5}$$
(4-15)

$$V_{2+} = V_{od5} + V_{th5} + I_{tail} * R < V_{od5} + 2V_{th5} - I_{tail} * R_{on3}$$
(4-16)

c) In the positive slewing phase, if V<sub>2,dm</sub> is sufficiently large to make both M3 and M5 work in the triode region, the common-mode output voltage is then calculated as (4-17). The on resistances of M3 and M5, R<sub>on3</sub> and R<sub>on5</sub>, are typically much smaller than the resistor R in a low-power design. Therefore, V<sub>2,cm</sub> is about I<sub>tail</sub>\*R. In addition, V<sub>2-</sub> and V<sub>2+</sub> are calculated as (4-18) and (4-19). V<sub>2+</sub> increases linearly with I<sub>tail</sub>. Therefore, a large transient I<sub>tail</sub> should be generated for the last preamp stage to maximize its output voltage swing and the entire op amp's slew rate. As the preamp is symmetric, the expressions of $V_{2+}$ and $V_{2-}$ shown in (4-18) and (4-19) are swapped in the negative slewing phase.

$$V_{2,cm} = \frac{V_{2+} + V_{2-}}{2} = I_{tail} * (R + R_{on3} + R_{on5}) \approx I_{tail} * R$$
(4-17)

$$V_{2-} = I_{tail} * (R_{on3} + R_{on5})$$
(4-18)

$$V_{2+} = I_{tail} * (2R + R_{on3} + R_{on5})$$
(4-19)
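The three cases above can be compared numerically with the short Python sketch below. It is only a sketch under simplifying assumptions: all overdrive voltages are taken equal to a single `v_od`, the supply is taken at its lower bound of 4\*V<sub>od</sub>+2\*V<sub>th</sub>, and the function names and example values are illustrative rather than part of the design.

```python
import math


def v2_plus_margin_case_a(v_od: float) -> float:
    """Lower bound on V_supply - V2+ when M3 and M5 are both saturated.

    Uses V_supply >= 4*V_od + 2*V_th and the bound of (4-13),
    V2+ < V_od + 2*V_th - sqrt(2)*V_od, so the V_th terms cancel.
    """
    return (3.0 + math.sqrt(2.0)) * v_od


def v2_plus_margin_case_b(v_od: float, i_tail: float, r_on3: float) -> float:
    """Lower bound on V_supply - V2+ with M3 in triode, M5 saturated (4-16)."""
    return 3.0 * v_od + i_tail * r_on3


def v2_outputs_case_c(i_tail: float, r: float, r_on3: float, r_on5: float):
    """V2- and V2+ of (4-18) and (4-19) with M3 and M5 both in triode."""
    v2_minus = i_tail * (r_on3 + r_on5)
    v2_plus = i_tail * (2.0 * r + r_on3 + r_on5)
    return v2_minus, v2_plus


if __name__ == "__main__":
    print(v2_plus_margin_case_a(0.1))                 # about 4.4 * V_od
    print(v2_outputs_case_c(1e-6, 200e3, 5e3, 5e3))   # hypothetical values
```

Only in case c) does V<sub>2+</sub> keep growing with the tail current, which is why the SRE circuit is designed to boost I<sub>tail</sub>.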

When V<sub>2-</sub> and V<sub>2+</sub> become V<sub>supply</sub> and 0 respectively in the negative slewing phase, the op amp's output slewing current can be easily calculated as (4-20), if M9C works in the saturation region in this phase. Under the reasonable assumptions that  $\beta_9=\beta_{10}$ ,  $\beta_{10C}=\beta_{9C}$  and M9 works in the triode region in the op amp's positive slewing phase, the gate voltage of M10, V<sub>g10</sub>, is found as about  $\frac{2}{3}$  (V<sub>supply</sub> – V<sub>th10</sub>) after solving the KCL equations at M10's gate node. If M10C works in the saturation region in the positive slewing phase and its threshold voltage is the same as that of M9C, then the positive slewing current, I<sub>SR+</sub>, can be simplified to about  $-\frac{4}{9}$  I<sub>SR-</sub> as shown in (4-21). Therefore, the transistor sizes of the op amp's output stage should be designed according to the op amp's slew rate specifications using equations (4-20) and (4-21).

$$I_{SR-} = -\frac{1}{2}\beta_{9C} (V_{supply} - V_{th9C})^2$$
(4-20)

$$I_{SR+} = \frac{1}{2}\beta_{17} (V_{supply} - V_7 - V_{th17})^2 = \frac{1}{2} * \frac{4}{9}\beta_{17} (V_{supply} - V_{th17})^2 = -\frac{4}{9}I_{SR-}$$
(4-21)



## 4.5.1.2 Small-signal Analysis of the Preamp Stage

The DC gain of the preamp shown in Figure 4.5(a) is denoted as $A_{V0}$ and calculated as (4-22). Assuming for simplicity that $f_T$ of M1 and M3 are the same and that each preamp is loaded by an identical preamp, this preamp's pole, $p_{nd}$, is found as (4-23), where $C_L$ is given by (4-24). Thus, the GBW of the preamp, GBW<sub>preamp</sub>, is derived as (4-25). When an op amp drives a very large capacitive load and its dominant pole is located at the op amp's output node, the amount of GBW enhancement and the amount of DC gain enhancement generated by the added preamp stages are the same. In this regard, when N of the preamp stages in Figure 4.5(a) are cascaded prior to the op amp's output stage, the GBW of the op amp, GBW<sub>enh</sub>, is found as (4-26), in which GBW<sub>orig</sub> is the GBW of the output stage of the op amp without any preamp stages. The phase drop caused by the poles in the N preamp stages can be calculated and simplified as (4-27) using (4-26). In order to have a phase margin of more than about 63 degrees for the op amp, $\phi_{drop}$ needs to be less than about 27°, or approximately 0.46rad. This phase margin requirement imposes the requirement on GBW<sub>preamp</sub>/GBW<sub>orig</sub> shown in (4-28).

$$A_{V0} = (g_{m1} + g_{m3}) * R$$
(4-22)

$$p_{nd} = \frac{1}{R(C_{gs1} + C_{gs3} + C_L)} = \frac{(g_{m1} + g_{m3})}{A_{V0}(C_{gs1} + C_{gs3})(1+m)} \approx \frac{f_T}{A_{V0}(1+m)}$$
(4-23)

$$C_{L} = m * (C_{gs1} + C_{gs3}) = C_{gd1} + C_{gd3} + C_{ds1} + C_{db1} + C_{ds3} + C_{db3}$$
(4-24)

$$GBW_{preamp} = \frac{g_{m1} + g_{m3}}{(C_{gs1} + C_{gs3})(1+m)} \approx \frac{f_T}{1+m}$$
(4-25)

$$GBW_{enh} = A_{V0}{}^{N}GBW_{orig}$$
(4-26)

$$\phi_{\rm drop} = N \tan^{-1} \left[ \frac{GBW_{\rm enh}}{p_{\rm nd}} \right] \approx \frac{N * GBW_{\rm enh}}{p_{\rm nd}} \approx \frac{NA_{\rm V0}^{\rm N+1} * GBW_{\rm orig}}{GBW_{\rm preamp}} \le 0.46$$
(4-27)



$$\frac{\text{GBW}_{\text{preamp}}}{\text{GBW}_{\text{orig}}} \ge \frac{\text{A}_{\text{V0}}^{N+1} * \text{N}}{0.46}$$
(4-28)

#### 4.5.1.3 GBW Enhancement Optimization for N-stage of Preamp in Cascade

For a given total current budget, I<sub>budget</sub>, for N identical preamp stages, the current budget is equally distributed among the N preamp stages, where N can be any positive integer. We define the GBW of the preamp as GBW<sub>preamp\_single</sub> when I<sub>budget</sub> is entirely consumed by a single preamp. Assuming the transistors in the preamps are working in the weak inversion region, scaling down the transistors' bias current to I<sub>budget</sub>/N without changing the size of the transistors shrinks the preamp's GBW by N times to GBW<sub>preamp\_single</sub>/N. Therefore, the ratio of GBW<sub>preamp\_single</sub> to GBW<sub>orig</sub>, GBW<sub>ratio</sub>, is derived as (4-29) by substituting (4-28). The GBW<sub>ratio</sub> depends on process feature sizes, bias current, load capacitor, etc. Different values of GBW<sub>ratio</sub> may result in different optimal numbers of preamp stages and different optimal GBW enhancement factors.

$$GBW_{ratio} = \frac{GBW_{preamp\_single}}{GBW_{orig}} = \frac{N * GBW_{preamp}}{GBW_{orig}} \ge A_{V0}^{N+1} * \frac{N^2}{0.46}$$
(4-29)

Figure 4.6 shows the dependency of the GBW enhancement factor, $A_{V0}^{N}$, on the number of preamp stages at different GBW<sub>ratio</sub>. The peak of $A_{V0}^{N}$ shifts to the upper right portion of the plot as the GBW<sub>ratio</sub> increases. This means that more preamp stages are needed to achieve optimal GBW enhancement factors as GBW<sub>ratio</sub> increases. For example, the optimal numbers of preamp stages for GBW<sub>ratio</sub> of 8\*10<sup>5</sup> and 5\*10<sup>4</sup> are respectively four and three. Increasing the load capacitor or the preamp's bias current or using a smaller transistor size will increase GBW<sub>ratio</sub>. In this design with the 0.18um CMOS process, I<sub>budget</sub> for the preamp stages is about 5uA and C<sub>L</sub>=15nF, so the GBW<sub>ratio</sub> is about 2\*10<sup>5</sup>. Figure 4.6 shows that the optimal number of preamp stages is 3~4 for this design and the largest GBW enhancement factor is about 1000.



The DC gain of each preamp stage should be about 10 for a 3-stage preamp and 5.6 for a 4-stage preamp. In this work, we design a 4-stage preamp with a gain of 5.8.
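The stage-count trade-off can be reproduced numerically from (4-29) taken with equality. The Python sketch below solves (4-29) for the largest per-stage gain allowed by the 0.46rad phase budget and then evaluates the enhancement factor $A_{V0}^N$; the function names are illustrative, and the model ignores everything not captured by (4-29).

```python
PHASE_BUDGET_RAD = 0.46  # phase-drop limit used in (4-27) to (4-29)


def max_stage_gain(gbw_ratio: float, n: int) -> float:
    """Largest per-stage gain A_V0 allowed by (4-29) for n stages."""
    return (PHASE_BUDGET_RAD * gbw_ratio / n**2) ** (1.0 / (n + 1))


def enhancement_factor(gbw_ratio: float, n: int) -> float:
    """GBW enhancement factor A_V0**n at the phase-budget limit."""
    return max_stage_gain(gbw_ratio, n) ** n


def best_stage_count(gbw_ratio: float, n_max: int = 10) -> int:
    """Stage count in 1..n_max that maximizes the enhancement factor."""
    return max(range(1, n_max + 1),
               key=lambda n: enhancement_factor(gbw_ratio, n))


if __name__ == "__main__":
    for ratio in (5e4, 2e5, 8e5):
        n = best_stage_count(ratio)
        print(ratio, n, round(max_stage_gain(ratio, n), 2),
              round(enhancement_factor(ratio, n)))
```

For a GBW<sub>ratio</sub> of 2\*10<sup>5</sup>, the three- and four-stage solutions give nearly the same enhancement factor, which is consistent with the 3~4 stage optimum read from Figure 4.6.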



Figure 4.6: Dependency of GBW enhancement factor on number of preamp stages

## 4.5.2 Design of the entire op amp

Figure 4.7 shows the schematic of the designed op amp. The adaptive bias circuit of the input pair is shown in Figure 4.8. The designed op amp consists of a class AB input stage, three V-V preamps and a class AB output stage. The adaptive biasing circuit for the input pairs is controlled by negative feedback loops formed by M1 and M20~M22. The adaptive biasing circuit regulates transistor M1's source voltage so that it tracks its gate voltage. Because of this, the input pairs M1A, M1B, M1C and M1D operate in class AB with effectively twice the input small signal. Due to the class-AB operation, when a large step input signal ($V_{id} = V_{ip} - V_{im} \gg V_{od1}$) is applied in the slewing phase, large transient drain currents of transistors M1A and M1C will be provided by transistor M22A, whereas M1B and M1D and their current mirrors have small currents. As a result, the transient current in M1C is much larger than that in M3B, and this excess transient current from M1C activates M0B. When the large step input signal is removed, transistor M0B returns to the off state. This happens automatically because the drain-source voltages of M3A and M3B are biased to be lower than the threshold voltages of M0A and M0B in the quiescent operation. Similarly, when a large negative input signal is applied, transistor M0A will be activated. Therefore, whenever there is a large transient input step signal, the total current in transistors M0A and M0B increases. The total current in M0A and M0B is mirrored and amplified into the last preamp stage's tail current by transistors M11A and M11B. The boosted tail current increases the output voltage swing of the last preamp stage, as given by (4-18) and (4-19). This substantially improves the slew rates.



Figure 4.7: Schematic of the designed op amp for driving 15nF load capacitor





Figure 4.8: The adaptive bias circuit for the designed op amp's input stage

In the negative slewing phase, transistors M17 and M15 respectively work in the cutoff and triode regions. The transient drain current of M15 is derived as $I_{15}\approx\beta_{15}V_{in\_avg}(V_{gs15}-V_{th15}-0.5*V_{in\_avg})\approx22.7mA$, because $V_{gs15}\approx V_{supply}-50mV=1.45V$, $\beta_{15}=\mu C_{ox}W_{15}/L_{15}=48.5mA/V^2$, $V_{th15}\approx0.46V$ and $V_{in\_avg}=0.78V$ in this design. Consequently, the expected negative slew rate (SR-) of the designed op amp is around $I_{15}/C_L=1.5V/\mu s$ with $C_L$ of 15nF. In the positive slewing phase, transistors M14 and M15 work in the triode and cutoff regions, respectively, with $V_{gs14}\approx V_{supply}-50mV=1.45V$. M16 still works in the saturation region because of its diode connection. Therefore, the KCL equation at the drain of M14 and M16 can be expressed as (4-30). After substituting $\lambda_{16}=0.25$, $\beta_{16}=12$, $\beta_{14}=13.9$, $V_{th16}=0.53V$, $V_{th14}=0.46V$ and $V_{gs14}=V_{gs15}=1.45V$ into (4-30), $V_{g16}$ is found as 0.28V and the drain current of M16 and M14 is found as 3.3mA. Since the aspect ratio of transistors M17 to M16 is 14/4, the expected drain current of M17 is about 3.3mA\*14/4=11.6mA and the expected positive slew rate (SR+) is about 0.77V/µs in the positive slewing phase.

$$\frac{1}{2}\beta_{16} (V_{\text{supply}} - V_{\text{g16}} - V_{\text{th16}})^2 [1 + \lambda_{16} (V_{\text{supply}} - V_{\text{g16}} - V_{\text{th16}})] = \beta_{14} V_{\text{g16}} (V_{\text{gs14}} - V_{\text{th14}} - 0.5 * V_{\text{g16}})$$
(4-30)
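The slew-rate estimates above can be cross-checked with a small Python sketch. Several points are assumptions of the sketch, not statements from the design: $\beta_{14}$ and $\beta_{16}$ are taken in mA/V² like $\beta_{15}$, the triode current of M14 is written in the same form as the $I_{15}$ expression (with its drain-source voltage equal to $V_{g16}$), and the KCL balance is solved by simple bisection.

```python
def triode_current(beta: float, v_gs: float, v_th: float, v_ds: float) -> float:
    """Triode-region drain current, beta * V_ds * (V_gs - V_th - 0.5 * V_ds)."""
    return beta * v_ds * (v_gs - v_th - 0.5 * v_ds)


def sat_current_m16(v_g16, v_supply=1.5, beta16=12.0, lam16=0.25, vth16=0.53):
    """Saturation current of M16 as written on the left side of (4-30), in mA."""
    v_ov = v_supply - v_g16 - vth16
    return 0.5 * beta16 * v_ov**2 * (1.0 + lam16 * v_ov)


def solve_v_g16(beta14=13.9, vgs14=1.45, vth14=0.46, v_supply=1.5, vth16=0.53):
    """Bisection on I_M16 - I_M14 = 0 over 0 <= V_g16 <= V_supply - V_th16."""
    lo, hi = 0.0, v_supply - vth16
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        balance = sat_current_m16(mid) - triode_current(beta14, vgs14, vth14, mid)
        if balance > 0:
            lo = mid      # M16 still sources more than M14 sinks
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slew_rates(c_load_nf: float = 15.0):
    """Return (SR-, SR+) in V/us; mA divided by nF gives V/us directly."""
    i15 = triode_current(48.5, 1.45, 0.46, 0.78)   # mA
    i17 = sat_current_m16(solve_v_g16()) * 14.0 / 4.0
    return i15 / c_load_nf, i17 / c_load_nf


if __name__ == "__main__":
    print(solve_v_g16(), slew_rates())
```

Under these assumptions the sketch lands close to the quoted values of 0.28V, 1.5V/µs and 0.77V/µs.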



In terms of the op amp's DC operation points, transistors M9A-M9D and M10A-M10B are respectively defined to have the same operation conditions as transistors M5A-M5B and M6A-M6B, because they have the same gate voltages and same current densities. Therefore, M9A-M9D and M5A-M5B work in the saturation region, whereas M10A-M10B and M6A-M6B work in the triode region. As for the last preamp stage, because the gate voltage of transistor M13 is used to define the output stage's current, transistor M13 is designed to work in the saturation region. This can be achieved by lowering the gate source voltage of transistors M9E and M9F. Therefore, each circuit branch in the designed op amp has a well-defined quiescent current and any common mode voltage errors from prior preamp stages will not proceed to the output voltage of subsequent preamp stages.

In addition, a feedforward connection, from the 2<sup>nd</sup> preamp's outputs to the NMOS input pair of the last preamp stage, is used to reduce the total gain of the preamp stages. This feedforward connection increases the op amp's input linear range. An input linear range that is too narrow could cause conditional instability in large-signal operation of multi-stage op amps [10].

To understand the frequency response of the entire op amp, the frequency response of the adaptive bias circuit in Figure 4.8 is analyzed first. With a differential input voltage applied as $V_{im}$ of -0.5\*$V_{in}$ and $V_{ip}$ of 0.5\*$V_{in}$, the KCL equations at nodes $V_{im2}$, $V_{x-}$ and $V_{y-}$ can be found as (4-31) to (4-33), where $g_x=g_{ds1}+g_{ds21}+g_{ds20}$, $g_y=g_{ds21}$ and $g_z=2g_{ds1}+g_{ds22}$. After solving (4-31) to (4-33), the transfer function of $(V_{ip}-V_{im2})/V_{ip}$ is found as (4-34), where a, b, c and d are expressed in (4-35) to (4-38). In order to obtain more insight from the equations, the parasitic capacitances (i.e., $C_{gs}$ and $C_{gd}$) and $f_T$ of transistors M1, M20-M22 are assumed to be close to each other for simplicity. Therefore, the expressions of a, b, c and d are approximated as $2/f_T$, $4/f_{T}$, $10/f_{T}^{2}$ and $5/f_{T}^{3}$, where $f_{T}$ of the transistors is on the order of 100MHz. With the approximated a, b, c and d, it can be found that the frequencies of the three LHP poles and three zeros in (4-34) are much higher than the GBW of the designed op amp, which is about 0.85MHz. Therefore, for simplicity, the transfer functions $(V_{ip}-V_{im2})/V_{ip}$ and $(V_{im}-V_{ip2})/V_{im}$ are approximated as 2 in the following frequency analysis.

$$V_{im2} * [2(g_{m1} + s * C_{gs1}) + sC_{gd22} + g_z] + V_y * (g_{m22} - sC_{gd22}) = 0$$
(4-31)

$$-g_{m1}\left(\frac{V_{in}}{2} + V_{im2}\right) + \frac{V_{in}}{2}sC_{gd1} + V_x(sC_{gs21} + sC_{gd1} + g_x + g_{m21}) - g_{ds21}V_y = 0$$
(4-32)

$$-g_{m21}V_x + V_y * (g_y + sC_{gs22} + sC_{gd22}) - V_{im2} * sC_{gd22} = 0$$
(4-33)

$$\frac{V_{ip} - V_{im2}}{V_{ip}} = \frac{V_{im} - V_{ip2}}{V_{im}} = \frac{2 + as + cs^2 + ds^3}{1 + bs + cs^2 + ds^3}$$
(4-34)

$$a = \frac{(2C_{gs22}g_{m1} + C_{gd22}g_{m22} - C_{gd1}g_{m22})}{g_{m1}g_{m22}} \approx \frac{2}{f_{T}}$$
(4-35)

$$b = \frac{(2C_{gs22}g_{m1} + C_{gd22}g_{m22} + C_{gd22}g_{m1})}{g_{m1}g_{m22}} \approx \frac{4}{f_T}$$
(4-36)

$$c = \frac{C_{gd22}[(2C_{gs1} + C_{gs22})g_{m21} + C_{gs21}(2g_{m1} + g_{m22})]}{g_{m1}g_{m21}g_{m22}} + \frac{2C_{gs22}(C_{gs21}g_{m1} + C_{gs1}g_{m21})}{g_{m1}g_{m21}g_{m22}} \approx \frac{10}{f_{T}^{2}}$$
(4-37)

$$d = \frac{C_{gs21}(2C_{gd22}C_{gs1} + C_{gd22}C_{gs22} + 2C_{gs1}C_{gs22})}{g_{m1}g_{m21}g_{m22}} \approx \frac{5}{f_T^3}$$
(4-38)

With the transfer functions $(V_{ip}-V_{im2})/V_{ip}$ and $(V_{im}-V_{ip2})/V_{im}$ approximated as 2, the small-signal block diagram of the designed op amp's input stage can be simplified as Figure 4.9. The gain of 2 is expressed by changing the input signal from $0.5*V_{in}$ to $V_{in}$. Three KCL equations, expressed as (4-39) to (4-41), are written for nodes $V_1$, $V_2$, and $V_3$. After solving the equations, the transfer function from $V_{in}$ to $V_3$, $TF_1(s)$, is derived as (4-42), in which the time constants are expressed in (4-43). The C<sub>i</sub> and g<sub>i</sub> in (4-43) are respectively the capacitance and conductance at node i, and their expressions are shown in Table 4.1. As expected, there are three LHP poles and one LHP zero in TF<sub>1</sub>(s), and the DC gain of TF<sub>1</sub>(s) is A<sub>1</sub> = $k*g_{m1}*R_3 = 3.5*g_{m1}*R_3 = 6.7$.



Figure 4.9: The small-signal block diagram of the op amp's input stage

$$-g_{m1} * V_{in} + V_1 * (s * C_1 + g_{m2}) = 0$$
(4-39)

$$g_{m1} * V_{in} + V_1 * g_{m3} + V_2(s * C_2 + g_{m4}) = 0$$
(4-40)

$$V_2 * g_{m4} = V_3 * (s * C_3 + 1/R_3)$$
(4-41)

$$TF_{1}(s) = \frac{V_{3}}{V_{in}} \approx \frac{-g_{m1}R_{3} * k(1 + s\tau_{z1})}{(1 + s\tau_{1})(1 + s\tau_{2})(1 + s\tau_{3})}; \ k = \frac{g_{m2} + g_{m3}}{g_{m2}} = 3.5$$
(4-42)

$$\tau_1 = \frac{C_1}{g_{m2}} = 11 \text{ns}, \ \tau_2 = \frac{C_2}{g_{m4}} = 9.6 \text{ns}, \ \tau_3 = \frac{C_3}{g_3} = 6.7 \text{ns}, \ \tau_{z1} = \frac{C_1}{k g_{m2}} = 3.2 \text{ns}$$
(4-43)
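To illustrate how (4-42) follows from (4-39) to (4-41), the Python sketch below solves the three node equations in sequence at a given complex frequency and compares the result with the factored form. The element values are hypothetical: they are chosen only so that k = 3.5, A<sub>1</sub> = 6.7 and the time constants match the values quoted above, and they are not the actual device parameters of the design.

```python
import math

# Hypothetical element values (not from the design), picked to reproduce
# k = 3.5, A1 = 6.7, tau1 = 11 ns, tau2 = 9.6 ns and tau3 = 6.7 ns.
GM2, GM3, GM4 = 10e-6, 25e-6, 10e-6          # siemens
R3 = 1e6                                      # ohms
GM1 = 6.7 / (3.5 * R3)                        # siemens
C1, C2, C3 = 11e-9 * GM2, 9.6e-9 * GM4, 6.7e-9 / R3   # farads


def tf1_from_kcl(s: complex) -> complex:
    """V3/Vin obtained by solving (4-39), (4-40) and (4-41) in order."""
    v_in = 1.0
    v1 = GM1 * v_in / (s * C1 + GM2)                 # (4-39)
    v2 = -(GM1 * v_in + GM3 * v1) / (s * C2 + GM4)   # (4-40)
    v3 = GM4 * v2 / (s * C3 + 1.0 / R3)              # (4-41)
    return v3 / v_in


def tf1_factored(s: complex) -> complex:
    """Factored form of (4-42) with the time constants of (4-43)."""
    k = (GM2 + GM3) / GM2
    tau1, tau2, tau3 = C1 / GM2, C2 / GM4, C3 * R3
    tau_z1 = C1 / (k * GM2)
    return (-GM1 * R3 * k * (1 + s * tau_z1)
            / ((1 + s * tau1) * (1 + s * tau2) * (1 + s * tau3)))


if __name__ == "__main__":
    s = 2j * math.pi * 0.85e6     # evaluate at the op amp's GBW
    print(tf1_from_kcl(s), tf1_factored(s))
```

For these idealized node equations the two forms agree exactly, so the approximation sign in (4-42) only reflects effects that the simplified block diagram omits.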

#### Table 4.1: Expressions of parasitic capacitance for the op amp's input stage

| Expression                                                                                           |
|------------------------------------------------------------------------------------------------------|
| $C_1 \approx C_{gs2} + C_{gs3} + C_{db1} + C_{db2} + C_{gd3} + C_{gd1}$                              |
| $C_2 \approx C_{db3} + C_{db1} + C_{db4} + C_{gs4} + C_{gs0} + C_{gd3} + C_{gd1}$                    |
| $C_3 \approx C_{db4} + C_{db5} + C_{gd4} + C_{gd5} + (C_{gd8} + C_{gd9}) * (g_{m8} + g_{m9}) R_2$    |
| $C_4 \approx C_{gd8} + C_{gd9} + (C_{gd8} + C_{gd9}) * (g_{m8} + g_{m9}) R_3 + C_{gd9} * g_{m9} R_4$ |
| $C_5 \approx C_{gd8} + C_{gd9} + C_{gd9} * g_{m8}R_4$                                                |
| $C_{6p} \approx C_{gd8} + C_{gd9} + C_{gs14} + C_{gd14}$                                             |
| $C_{6m} \approx C_{gd8} + C_{gd9} + C_{gs15} + C_{gd15}$                                             |
| $C_{6t} = C_{6p} + C_{6m} \approx C_{6p} (1 + g_{m15}/g_{m14}) = 4.5 * C_{6p}$                       |
| $C_7 \approx C_{gs16} + C_{gs17} + C_{gd17} + C_{gd14}$                                              |
| $g_3=1/R_3, g_4\approx 1/R_4, g_5\approx 1/R_5, g_6\approx 1/R_6, g_L=g_{ds17}+g_{ds15}$             |





Figure 4.10: Small-signal block diagram of the designed op amp from its 1<sup>st</sup> preamp stage to its output stage

To analyze the complete transfer function from the input to the output of the op amp, the small-signal block diagram from the op amp's 1<sup>st</sup> preamp stage output to its output stage is drawn as Figure 4.10. In the block diagram, $g_{m8}=g_{m9}$ is used for simplicity. The KCL equations at nodes V<sub>4</sub>, V<sub>5</sub>, V<sub>6p</sub>, V<sub>6m</sub>, V<sub>7</sub> and V<sub>out</sub> are written as (4-44) to (4-49). After solving the equations, the transfer function from V<sub>3</sub> to V<sub>out</sub> is derived as (4-50), where A<sub>2</sub> and A<sub>3</sub> are respectively the three preamp stages' total DC gain and the output stage's DC gain. Therefore, the DC gain from V<sub>in</sub> to V<sub>6</sub> can be found as $A_1 * A_2 = 3.5g_{m1} * 2g_{m8}^2(1/R_5 + 2g_{m8}) * R_3R_4R_5R_6 = 1072$, in which A<sub>1</sub>=6.7 and A<sub>2</sub>=160. The transfer function from the op amp input to V<sub>3</sub> is also recalled as (4-51) from (4-42). The gains and time constants in (4-50) and (4-51) are given in (4-52) to (4-56). In addition, the expressions of g<sub>i</sub> and C<sub>i</sub> are listed in Table 4.1. Therefore, the op amp's transfer function from its input to output is calculated as TF<sub>1</sub>(s)\*TF<sub>2</sub>(s), which consequently has four LHP zeros and nine LHP poles. The distribution of the designed op amp's poles and zeros within 5 times of the GBW of the op amp is shown in Figure 4.11, in which P<sub>-3dB</sub>=g<sub>L</sub>/C<sub>L</sub>, P<sub>i</sub>=1/τ<sub>i</sub>, Z<sub>j</sub>=1/τ<sub>zj</sub>, i=1~5, 6m, 6p and j=1~4. With these pole and zero locations, the phase margin of the designed op amp is 62.5° with a GBW of 0.85MHz.

$$V_3 * 2g_{m8} + V_4(g_4 + sC_4) = 0$$
(4-44)

$$V_4 * 2g_{m8} + V_5 * (g_5 + sC_5) = 0$$
(4-45)

$$-V_4 * 0.5g_{m8} + V_5 * 0.5g_{m8} + V_{6p} * (g_6 + sC_{6p}) = 0$$
(4-46)

$$V_4 * 0.5g_{m8} - V_5 * 0.5g_{m8} + V_{6m}(g_6 + sC_{6m}) = 0$$
(4-47)

$$V_{6p} * g_{m14} + V_7(g_{m16} + sC_7) = 0$$
(4-48)

$$V_7 * g_{m17} + V_{6m} * g_{m15} + V_{out} * (g_L + sC_L) = 0$$
(4-49)

$$TF_{2}(s) = \frac{V_{o}}{V_{3}} \approx \frac{-A_{2}A_{3}(1+s\tau_{z2})[1+s\tau_{z3}][1+s\tau_{z4}]}{(1+\tau_{4}s)(1+\tau_{5}s)(1+\tau_{6m}s)(1+\tau_{6p}s)(1+\tau_{7}s)(1+\frac{C_{L}}{g_{L}}s)}$$
(4-50)

$$TF_{1}(s) = \frac{V_{3}}{V_{in}} \approx \frac{-A_{1}(1 + s\tau_{z1})}{(1 + s\tau_{1})(1 + s\tau_{2})(1 + s\tau_{3})}$$
(4-51)

$$A_1 = 3.5g_{m1}R_3, A_2 = \frac{2g_{m8}^2(g_5 + 2g_{m8})}{g_4g_5g_6} = 160, A_3 = \frac{g_{m15}}{g_L}$$
 (4-52)

$$\tau_1 = \frac{C_1}{g_{m2}} = 11 \text{ns}, \ \tau_2 = \frac{C_2}{g_{m4}} = 9.6 \text{ns}, \ \tau_3 = \frac{C_3}{g_3} = 6.7 \text{ns}, \ \tau_4 = \frac{C_4}{g_4} = 7.8 \text{ns}$$
(4-53)

$$\tau_5 = \frac{C_5}{g_5} = 5.6 \text{ns}, \tau_{6p} = \frac{C_{6p}}{g_6} = 20 \text{ns}, \tau_{6m} = \frac{C_{6m}}{g_6} = 52.4 \text{ns}$$
 (4-54)

$$\tau_7 = \frac{C_7}{g_{m16}} = 22.7 \text{ns}, \tau_{z1} = \frac{C_1}{\text{kg}_{m2}} = 3.2 \text{ns}; \ \tau_{z2} = \frac{C_5}{g_5 + 2g_{m8}} = 0.83 \text{ns},$$
 (4-55)

$$\tau_{z3} = \frac{1}{2} \left( \frac{C_{6t}}{g_6} + \frac{C_7}{g_{m16}} \right) = 36.2 \text{ns}, \tau_{z4} = \frac{C_{6p}C_7}{C_{6t}g_{m16} + C_7g_6} = 4.8 \text{ns}$$
(4-56)

$$GBW = A_1 * A_2 * \frac{g_{m15}}{C_L} = 3.5g_{m1}2g_{m8}^2 \left(\frac{1}{R_5} + 2g_{m8}\right) * R_3 R_4 R_5 R_6 * \frac{g_{m15}}{C_L}$$
(4-57)
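The phase margin quoted above can be estimated directly from the time constants in (4-53) to (4-56). The Python sketch below sums the phase lag of each pole and the phase lead of each LHP zero at the GBW frequency. It assumes the dominant pole sits far below the GBW (10Hz is used as a placeholder, and the result is insensitive to this value) and that the unity-gain frequency equals the 0.85MHz GBW.

```python
import math

POLE_TAUS_NS = [11.0, 9.6, 6.7, 7.8, 5.6, 20.0, 52.4, 22.7]  # tau1..tau5, tau6p, tau6m, tau7
ZERO_TAUS_NS = [3.2, 0.83, 36.2, 4.8]                        # tau_z1..tau_z4 (LHP zeros)


def phase_margin_deg(gbw_hz: float,
                     pole_taus_ns=POLE_TAUS_NS,
                     zero_taus_ns=ZERO_TAUS_NS,
                     dominant_pole_hz: float = 10.0) -> float:
    """Phase margin at gbw_hz for one dominant pole plus the listed poles/zeros."""
    w = 2.0 * math.pi * gbw_hz
    lag = math.atan(gbw_hz / dominant_pole_hz)               # close to 90 degrees
    lag += sum(math.atan(w * t * 1e-9) for t in pole_taus_ns)
    lead = sum(math.atan(w * t * 1e-9) for t in zero_taus_ns)
    return 180.0 - math.degrees(lag - lead)


if __name__ == "__main__":
    print(round(phase_margin_deg(0.85e6), 1))
```

The non-dominant poles cost roughly 41° in total, of which about 14° is recovered by the LHP zeros, mostly by the zero associated with $\tau_{z3}$.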





Figure 4.11: Distribution of the op amp's poles and zeros within 5 times of GBW

Compared with the GBW of the op amp without preamp stages, $g_{m15}/C_L$, the op amp's GBW with the preamp stages is enhanced by $A_1 * A_2 = 1072$ times, as shown by (4-57). From (4-57), it can also be seen that the GBW of the op amp is approximately proportional to $g_m^5$ and $R^4$. As the resistor's process variation in this 180nm CMOS process is about -50%~+45% of its typical value, $R^4$, and hence the GBW, can range from 6.25% to 442% of the typical value. To reduce this variation, a constant gm bias circuit like [11] is used as this op amp's bias circuit. The constant gm bias circuit makes the NMOS transistors' gm proportional to 1/R. As a result, $A_1 * A_2$ is approximately constant, and the op amp's GBW becomes proportional only to the transistor's gm. Therefore, the expected GBW ranges from 70% to 200% of the typical GBW. This GBW variation can be further reduced by trimming the resistors in the preamp stages, trimming the transistor sizes at the op amp's output stage, or implementing the resistors with transistors. In addition, the bias current generated by the constant gm bias circuit is also roughly proportional to 1/R, so the expected quiescent current of the op amp also ranges from 70% to 200% of the typical value under process corner variations. Because the GBW and the supply current, I<sub>supply</sub>, of the op amp have similar dependencies on the resistor's variation, the ratio of GBW to I<sub>supply</sub> should have less variation. As a result, the variation of FOMs=GBW\*C<sub>L</sub>/I<sub>supply</sub> is expected to be small, although its value needs to be confirmed by simulation. On the other hand, some foundries closely monitor the doping concentrations of the poly and the whole wafer as part of their standard procedure. As a result, the poly resistor values are almost always close to the values in the typical corner. In that scenario, no constant gm bias circuit is needed.
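The variation ranges quoted above can be reproduced with a short calculation. The sketch below is only an illustration: it assumes the idealized relations stated in the text (GBW proportional to $g_m^5 R^4$ without the constant gm bias, and gm exactly proportional to 1/R with it), and the function names are hypothetical.

```python
# Resistor process spread quoted in the text: -50% to +45% of typical,
# expressed here as a ratio to the typical resistor value.
R_MIN, R_MAX = 0.50, 1.45


def r4_range(r_min=R_MIN, r_max=R_MAX):
    """Range of R^4 (and of GBW if gm were fixed), normalized to typical."""
    return r_min ** 4, r_max ** 4


def inverse_r_range(r_min=R_MIN, r_max=R_MAX):
    """Range of 1/R, normalized to typical.

    Assumption: with the constant gm bias, gm is exactly proportional to 1/R,
    so gm^5 * R^4 collapses to 1/R. The bias current follows the same law.
    """
    return 1.0 / r_max, 1.0 / r_min


if __name__ == "__main__":
    lo4, hi4 = r4_range()
    lo1, hi1 = inverse_r_range()
    print(f"R^4 spread without constant gm bias: {lo4:.2%} to {hi4:.0%}")
    print(f"1/R spread with constant gm bias:    {lo1:.0%} to {hi1:.0%}")
```

Under these assumptions the first range matches the 6.25%~442% figure, and the second is close to the 70%~200% estimate used for both the GBW and the quiescent current.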

#### **4.6. Simulation Results**

In this section, the designed op amp in a CMOS 180nm process is simulated under three different conditions: 1) under process corner variations only, 2) under mismatch variations only, and 3) under process corner plus mismatch variations. The purposes of the simulations are threefold: a) to confirm that the quiescent current of the op amp is well controlled; b) to verify the theoretical analysis of the op amp's frequency and transient responses, including phase margin and slew rate, in the typical corner; and c) to confirm that the designed op amp provides favorable small- and large-signal figures of merit, FOM<sub>s</sub> and FOM<sub>L</sub>, compared with the state-of-the-art op amp designs [3][9] for driving large capacitive loads. FOM<sub>s</sub> and FOM<sub>L</sub> are defined in (4-58), where GBW, C<sub>L</sub>, SR and I<sub>supply</sub> are respectively the gain-bandwidth product, load capacitor, slew rate and supply current of the op amp.

$$FOM_s = \frac{GBW * C_L}{I_{supply}}; \quad FOM_L = \frac{SR * C_L}{I_{supply}}$$
(4-58)






## 4.6.1 Typical corner simulation results

Figure 4.12: Frequency response of the designed op amp at typical corner

Figure 4.12 shows the frequency response of the designed op amp with a 15nF load capacitor. The simulated DC gain, GBW and phase margin (PM) are respectively 92.6dB, 0.85MHz and 62.5°. The simulated phase margin agrees with the theoretical calculation in Section 4.5.2. Also, as expected, the op amp has a dominant pole at low frequency, about 10Hz, and all other nondominant poles are located at frequencies a few times higher than the op amp's GBW.





Figure 4.13: Transient response of the designed op amp at typical corner

The designed op amp's large- and small-signal transient responses are simulated in the noninverting unity gain buffer configuration with input step voltages of 400mV and 60mV. The simulated transient responses are shown in Figure 4.13. As can be seen, the positive slew rate (SR+) is 0.77V/µs and the negative slew rate (SR-) is -1.4V/µs. The simulated SR+ and SR- are consistent with the calculated slew rates. With a positive large step input, the op amp's settling times with 1% (Ts+\_1%) and 0.1% (Ts+\_0.1%) settling accuracy are respectively 0.85µs and 1.06µs. With a negative large step input, the op amp's settling times with 1% (Ts-\_1%) and 0.1% (Ts-\_0.1%) settling accuracy are respectively 0.613µs and 0.801µs. The quiescent supply current and supply voltage of the op amp are respectively 6.56µA and 1.5V. Therefore, FOM<sub>s</sub> and FOM<sub>L</sub> are derived as 1940pF/MHz-uA and 2500pF\*V/us-uA. Also, the op amp's input referred voltage noise density is 1.5µV/sqrt(Hz) at 100Hz and 93.6nV/sqrt(Hz) at 100KHz. In addition, the op amp's power supply rejection ratio (PSRR) is -93.2dB at 1KHz and -91.2dB at 100KHz. The performance summary of the designed op amp in the typical corner is shown in Table 4.2.



| Output         | Unit        | Typ      |
|----------------|-------------|----------|
| Phase Margin   | degree      | 62.5     |
| GBW            | MHz         | 0.846    |
| DC gain        | dB          | 92.64    |
| Isupply        | μA          | 6.56     |
| Vos, 1o        | mV          | -0.00958 |
| SR-            | V/µs        | -1.41    |
| SR+            | V/µs        | 0.778    |
| Ts-_1%         | μs          | 0.613    |
| Ts+_1%         | μs          | 0.852    |
| Ts-_0.1%       | μs          | 0.801    |
| Ts+_0.1%       | μs          | 1.06     |
| FOMs           | pF/MHz-uA   | 1940     |
| FOML           | pF*V/us-uA  | 2500     |
| noise_at_100Hz | nV/sqrt(Hz) | 1510     |
| noise_at_100K  | nV/sqrt(Hz) | 93.6     |
| PSRR at 1KHz   | dB          | -93.2    |
| PSRR at 100KHz | dB          | -91.23   |

Table 4.2: Performance summary of the designed op amp in the typical corner
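As a cross-check of the figures of merit in Table 4.2, the definitions in (4-58) can be evaluated directly from the typical-corner values. This is a minimal sketch, not the author's script. It assumes that the slew rate used in FOM<sub>L</sub> is the average of the magnitudes of SR+ and SR-, which is consistent with the 1.1V/µs typical slew rate listed in Table 4.8. The function names are hypothetical.

```python
def fom_small_signal(gbw_mhz, cl_pf, i_supply_ua):
    """FOM_s = GBW * C_L / I_supply, following (4-58)."""
    return gbw_mhz * cl_pf / i_supply_ua


def fom_large_signal(sr_v_per_us, cl_pf, i_supply_ua):
    """FOM_L = SR * C_L / I_supply, following (4-58)."""
    return sr_v_per_us * cl_pf / i_supply_ua


# Typical-corner values taken from Table 4.2.
GBW_MHZ = 0.846
CL_PF = 15_000.0          # 15 nF load expressed in pF
I_SUPPLY_UA = 6.56
SR_POS, SR_NEG = 0.778, 1.41   # slew-rate magnitudes in V/us

# Assumption: FOM_L uses the average of |SR+| and |SR-|.
sr_avg = (SR_POS + SR_NEG) / 2

fom_s = fom_small_signal(GBW_MHZ, CL_PF, I_SUPPLY_UA)
fom_l = fom_large_signal(sr_avg, CL_PF, I_SUPPLY_UA)
print(f"FOM_s is about {fom_s:.0f}, FOM_L is about {fom_l:.0f}")
```

Both results land within about 1% of the tabulated 1940 and 2500, with the residual attributable to rounding of the inputs.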

#### **4.6.2 Process corner variation simulation results**

In this section, the designed op amp is simulated under process corner variations. The purposes of the simulations are threefold: a) to verify the functionality of the designed op amp under process corner variations; b) to check the variations of the op amp's GBW, PM, DC gain, slew rate and settling time under process corner variations; and c) to confirm the robustness of the op amp's FOMs and FOM<sub>L</sub> under process corner variations. The process corner setup is shown in Table 4.3.





Figure 4.14: Frequency responses of the designed op amp at all process corners

Table 4.3: Process corner setups for the simulations of the designed op amp

| Parameter | typ  | All0 | All1 | All2 | All3 | All4 | All5 | All6 | All7 | low  | high |
|-----------|------|------|------|------|------|------|------|------|------|------|------|
| Capacitor | typ  | typ  | typ  | typ  | typ  | typ  | typ  | typ  | typ  | low  | high |
| MOSFET    | tntp | hnlp | lnhp | snsp | wnwp | hnlp | lnhp | snsp | wnwp | wnwp | snsp |
| Resistor  | typ  | high | high | high | high | low  | low  | low  | low  | low  | high |

Figure 4.14 shows the designed op amp's frequency response under process corner variations. The (min, typ, max) values of the simulated DC gain, phase margin (PM), GBW and supply current (I<sub>supply</sub>) are respectively (89.7dB, 92.64dB, 93.3dB), (59.7°, 62.5°, 69.9°), (0.43MHz, 0.846MHz, 1.49MHz) and (4.54µA, 6.56µA, 12.4µA). The ranges of the GBW and supply current are respectively within 51%~176% and 69%~189% of their typical values. These match reasonably well with the calculated ranges of 70%~200% for both quantities. The (min, typ, max) values of the simulated FOMs are (1310 pF/MHz-µA, 1940 pF/MHz-µA, 1940 pF/MHz-µA). As expected, the variation range of FOMs, 68%~100%, is smaller than that of GBW or I<sub>supply</sub> because the GBW and I<sub>supply</sub> of the op amp have similar dependencies on the resistors' variation.
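The percentage ranges quoted above follow from normalizing each corner extreme to the typical value. The sketch below uses the simulated (min, typ, max) values from this subsection; the helper name `spread_percent` is hypothetical.

```python
def spread_percent(minimum, typical, maximum):
    """Return (min/typ, max/typ) expressed as percentages of the typical value."""
    return 100.0 * minimum / typical, 100.0 * maximum / typical


# (min, typ, max) values over process corners, as reported in this subsection.
corner_results = {
    "GBW (MHz)": (0.434, 0.846, 1.49),
    "Isupply (uA)": (4.54, 6.56, 12.4),
    "FOMs": (1310.0, 1940.0, 1940.0),
}

for name, (lo, typ, hi) in corner_results.items():
    lo_pct, hi_pct = spread_percent(lo, typ, hi)
    print(f"{name}: {lo_pct:.0f}% to {hi_pct:.0f}% of typical")
```

The FOMs range is visibly narrower than the GBW and supply current ranges, in line with the argument that the two quantities track each other over resistor variation.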



Figure 4.15: Transient step responses of the designed op amp at all process corners

The designed op amp's large- and small-signal transient responses are simulated in the noninverting unity gain buffer configuration with input step voltages of 400mV and 60mV under process corner variations. The simulated transient performance of the op amp is shown in Figure 4.15. The (min, typ, max) values of the simulated Ts+\_1%, Ts-\_1%, Ts+\_0.1% and Ts-\_0.1% are respectively (0.52µs, 0.85µs, 1.17µs), (0.44µs, 0.61µs, 0.71µs), (0.65µs, 1.06µs, 1.46µs) and (0.56µs, 0.80µs, 1.0µs). The (min, typ, max) values of the simulated SR+ and SR- are (0.65V/µs, 0.78V/µs, 0.95V/µs) and (-1.23V/µs, -1.41V/µs, -1.64V/µs), where the SR- values are ordered by magnitude. As a result, the (min, typ, max) of FOM<sub>L</sub> is (1160pF\*V/us-µA, 2500pF\*V/us-µA, 3860pF\*V/us-µA). The FOM<sub>L</sub> of the designed op amp exceeds those of [3] and [9]. The performance summary of the designed op amp under process corner variations is shown in Table 4.4.



| Output         | Unit            | Min    | Max    | Typ    |
|----------------|-----------------|--------|--------|--------|
| Phase Margin   | degree          | 59.73  | 69.92  | 62.5   |
| GBW            | MHz             | 0.434  | 1.49   | 0.846  |
| DC gain        | dB              | 89.71  | 93.32  | 92.64  |
| Isupply        | μA              | 4.54   | 12.4   | 6.56   |
| SR-            | V/µs            | -1.23  | -1.64  | -1.41  |
| SR+            | V/µs            | 0.645  | 0.945  | 0.778  |
| Ts-_1%         | μs              | 0.444  | 0.713  | 0.613  |
| Ts+_1%         | μs              | 0.525  | 1.17   | 0.852  |
| Ts-_0.1%       | μs              | 0.563  | 1      | 0.801  |
| Ts+_0.1%       | μs              | 0.654  | 1.46   | 1.06   |
| FOMs           | pF/MHz-µA       | 1310   | 1940   | 1940   |
| FOML           | pF*V/us-µA      | 1160   | 3860   | 2500   |
| Noise_at_100Hz | nV/sqrt(Hz)     | 1070   | 3380   | 1510   |
| Noise_at_100K  | nV/sqrt(Hz)     | 72.5   | 128    | 93.6   |
| PSRR at 1KHz   | dB              | -120.4 | -90.42 | -93.2  |
| PSRR at 100KHz | dB              | -110.5 | -86.05 | -91.23 |

Table 4.4: Performance summary of the designed op amp under process corner variation

#### **4.6.3** Mismatch variation simulation results

In this section, the designed op amp is simulated using a 1000-run Monte Carlo simulation with mismatch variations only. The purposes of the simulations are twofold: a) to verify that the op amp's quiescent current is well controlled; and b) to verify the tight spread of the op amp's performance, including GBW, PM, DC gain, slew rate and settling time. Figure 4.16 shows the frequency responses of the designed op amp under mismatch variations. The 1000-run Monte Carlo simulation shows that the (mean, sigma) pairs of the simulated PM, DC gain, GBW and supply current of the designed op amp are respectively (62.5°, 2.5°), (92.5dB, 0.7dB), (0.83MHz, 0.08MHz) and (6.56µA, 0.19µA). The tight spread of the supply current shows that the op amp's quiescent current is well defined. Therefore, unlike [3], the designed op amp's small-signal performance is robust under random mismatches.



Figure 4.16: Frequency responses of the designed op amp under mismatch variation

The simulated transient responses of the op amp are shown in Figure 4.17. As can be seen, the op amp always settles to its final steady-state voltage after a certain period. The final steady-state voltage varies slightly due to the op amp's random offset voltages. The offset voltages of the op amp have a normal distribution with a mean of 0.03mV and a sigma of 2.8mV. The (mean, sigma) pairs of the simulated SR- and SR+ are respectively (-1.4V/µs, 0.01V/µs) and (0.78V/µs, 0.003V/µs). Similarly, the (mean, sigma) pairs of Ts+\_1%, Ts-\_1%, Ts+\_0.1%, and Ts-\_0.1% are found to be (0.84µs, 0.04µs), (0.58µs, 0.08µs), (1.06µs, 0.04µs) and (0.77µs, 0.1µs). The very tight spread of the slew rate and settling time of the designed op amp confirms the robustness of the op amp's large-signal performance under random mismatch variations.



Figure 4.17: Transient responses of the designed op amp under mismatch variation



Figure 4.18: FOMs of the designed op amp under mismatch variation

The histograms of the FOMs and FOM<sub>L</sub> of the op amp are shown in Figure 4.18 and Figure 4.19. FOMs has a normal distribution with a mean and sigma of 1904pF/MHz-uA and 140pF/MHz-uA respectively, whereas the mean and sigma of FOM<sub>L</sub> are respectively 2501pF\*V/us-uA and 70pF\*V/us-uA. The narrow variations of the op amp's FOMs and FOM<sub>L</sub> again confirm the robustness of the designed op amp under mismatch variations. The performance summary of the designed op amp under mismatch variations is shown in Table 4.5.





Figure 4.19: FOM<sub>L</sub> of the designed op amp under mismatch variation

| Output         | Unit        | Min    | Max    | Mean   | Median | Std Dev |
|----------------|-------------|--------|--------|--------|--------|---------|
| Phase Margin   | degree      | 54.1   | 69.7   | 62.6   | 62.7   | 2.5     |
| GBW            | MHz         | 0.6    | 1.1    | 0.8    | 0.8    | 0.1     |
| DC gain        | dB          | 90.2   | 95.2   | 92.5   | 92.5   | 0.7     |
| Isupply        | μA          | 6.0    | 7.2    | 6.6    | 6.5    | 0.2     |
| Vos            | mV          | -9.7   | 10.6   | -0.1   | 0.0    | 2.8     |
| SR-            | V/µs        | -1.4   | -1.4   | -1.4   | -1.4   | 0.0     |
| SR+            | V/µs        | 0.8    | 0.8    | 0.8    | 0.8    | 0.0     |
| Ts-_1%         | μs          | 0.4    | 0.7    | 0.6    | 0.6    | 0.1     |
| Ts+_1%         | μs          | 0.6    | 0.9    | 0.8    | 0.8    | 0.0     |
| Ts-_0.1%       | μs          | 0.4    | 1.1    | 0.8    | 0.7    | 0.1     |
| Ts+_0.1%       | μs          | 0.8    | 1.2    | 1.1    | 1.1    | 0.0     |
| FOMs           | pF/MHz-uA   | 1497.0 | 2389.0 | 1904.0 | 1897.0 | 140.9   |
| FOML           | pF*V/us-uA  | 2297.0 | 2721.0 | 2501.0 | 2501.0 | 70.4    |
| Noise_at_100Hz | nV/sqrt(Hz) | 1448.0 | 1578.0 | 1509.0 | 1509.0 | 17.6    |
| Noise_at_100K  | nV/sqrt(Hz) | 89.5   | 97.7   | 93.7   | 93.7   | 1.4     |
| PSRR at 1KHz   | dB          | -124.8 | -79.2  | -91.4  | -90.4  | 6.6     |
| PSRR at 10KHz  | dB          | -116.8 | -69.1  | -82.8  | -82.0  | 6.3     |
| PSRR at 100KHz | dB          | -104.2 | -48.4  | -63.7  | -62.1  | 8.0     |

Table 4.5: Performance summary of the designed op amp under mismatch variation

#### **4.6.4 Process corner and mismatch variation simulation results**



In this section, the designed op amp is simulated under both process corner and mismatch (P.Mis) variations using a 1000-run Monte Carlo simulation. The purposes of these simulations are threefold: a) to verify the functionality of the designed op amp under P.Mis variations; b) to check the variations of the op amp's GBW, PM, DC gain, slew rate and settling time under P.Mis variations; and c) to confirm the robustness of the op amp's FOMs and FOM<sub>L</sub> under P.Mis variations. Figure 4.20 shows the simulated frequency responses of the designed op amp with a 15nF load capacitor under P.Mis variations. The simulated PM, DC gain, GBW, I<sub>supply</sub> and FOM<sub>s</sub> all have normal distributions, each with its own mean and sigma. Their (mean, sigma) pairs are respectively (62.3°, 3.0°), (92.5dB, 1.1dB), (0.9MHz, 0.27MHz), (7.0µA, 1.87µA) and (1910 pF/MHz-µA, 167 pF/MHz-µA). As discussed, the variation in GBW is mainly caused by the process corner variation of the resistors. If a more constant GBW is desired, the resistors in either the 1<sup>st</sup> or the 2<sup>nd</sup> preamp of the op amp can be trimmed. As both GBW and I<sub>supply</sub> vary in a comparable way as the resistor value varies, the variation of GBW/I<sub>supply</sub> is much smaller than that of GBW or I<sub>supply</sub> alone. This is why the normalized variation (sigma/mean) of FOMs is smaller than that of GBW or I<sub>supply</sub>. The histogram of FOMs is shown in Figure 4.21.
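The comparison of normalized variations can be made explicit with the (mean, sigma) pairs reported above. The sketch below only restates those numbers; the helper name `normalized_sigma` is hypothetical.

```python
def normalized_sigma(mean, sigma):
    """Normalized variation, sigma divided by mean."""
    return sigma / mean


# (mean, sigma) pairs from the 1000-run P.Mis Monte Carlo simulation.
p_mis_stats = {
    "GBW (MHz)": (0.90, 0.27),
    "Isupply (uA)": (7.0, 1.87),
    "FOMs": (1910.0, 167.0),
}

for name, (mean, sigma) in p_mis_stats.items():
    print(f"{name}: sigma/mean = {normalized_sigma(mean, sigma):.1%}")
```

With these inputs, the normalized variation of FOMs comes out several times smaller than that of GBW or I<sub>supply</sub>.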

The simulated transient responses of the designed op amp under P.Mis variations are shown in Figure 4.22. As can be seen, the op amp always settles to its final steady-state voltage after a certain period. The op amp's offset voltages show a normal distribution with a mean and sigma of 0.1mV and 2.8mV. The simulated Ts+\_1%, Ts+\_0.1%, Ts-\_1% and Ts-\_0.1% have normal distributions with (mean, sigma) of (0.82µs, 0.15µs), (1.03µs, 0.2µs), (0.56µs, 0.08µs) and (0.8µs, 0.2µs) respectively. The simulated SR- and SR+ have normal distributions with (mean, sigma) of (-1.4V/µs, 0.06V/µs) and (0.78V/µs, 0.08V/µs). The FOM<sub>L</sub> also has a normal distribution with (mean, sigma) of (2474 pF\*V/us-µA, 604.5 pF\*V/us-µA). The spread of FOM<sub>L</sub> is mainly caused by variations in the supply current under process corner variations. The histogram of FOM<sub>L</sub> is shown in Figure 4.23. The performance summary of the designed op amp under P.Mis variations is shown in Table 4.6.



Figure 4.20: Frequency responses of the designed op amp under P.Mis variation





Figure 4.21: FOMs of the designed op amp under P.Mis variation



Figure 4.22: Transient responses of the designed op amp under P.Mis variation



Figure 4.23: FOM<sub>L</sub> of the designed op amp under P.Mis variation



| Output         | Unit        | Min     | Max    | Mean   | Median | Std Dev |
|----------------|-------------|---------|--------|--------|--------|---------|
| Phase Margin   | degree      | 52.73   | 71.54  | 62.26  | 62.37  | 3.08    |
| GBW            | MHz         | 0.47    | 1.74   | 0.90   | 0.84   | 0.27    |
| DC gain        | dB          | 89.12   | 95.38  | 92.53  | 92.54  | 1.07    |
| Isupply_prop   | μA          | 4.30    | 12.29  | 7.03   | 6.64   | 1.88    |
| Vos            | mV          | -8.96   | 8.33   | 0.10   | 0.16   | 2.78    |
| SR-            | V/µs        | -1.57   | -1.27  | -1.40  | -1.39  | 0.06    |
| SR+            | V/µs        | 0.63    | 0.94   | 0.78   | 0.78   | 0.08    |
| Ts-_1%         | μs          | 0.36    | 0.84   | 0.56   | 0.56   | 0.09    |
| Ts+_1%         | μs          | 0.46    | 1.17   | 0.82   | 0.84   | 0.15    |
| Ts-_0.1%       | μs          | 0.40    | 1.72   | 0.80   | 0.75   | 0.20    |
| Ts+_0.1%       | μs          | 0.54    | 1.73   | 1.03   | 1.03   | 0.20    |
| FOMs           | pF/MHz-µA   | 1361    | 2465   | 1910   | 1902   | 167     |
| FOML           | pF*V/us-µA  | 1212    | 3900   | 2474   | 2486   | 605     |
| Noise_at_100Hz | nV/sqrt(Hz) | 1423    | 1846   | 1522   | 1520   | 42      |
| Noise_at_100K  | nV/sqrt(Hz) | 75.47   | 112.10 | 93.91  | 93.98  | 8.60    |
| PSRR at 1KHz   | dB          | -121.30 | -64.36 | -89.10 | -88.60 | 7.97    |
| PSRR at 10KHz  | dB          | -110.70 | -62.66 | -81.50 | -81.21 | 7.05    |
| PSRR at 100KHz | dB          | -104.80 | -44.71 | -63.24 | -61.78 | 9.09    |

Table 4.6: Performance summary of the designed op amp under P.Mis variation

## 4.6.5 Post-layout simulation results

The layout of the designed op amp is shown in Figure 4.24. Compared with the schematic simulation results, the post-layout simulation results of the proposed op amp show a reduction of about 14.5% in GBW and 3.5% in supply current. The reduction in GBW is caused by the routing parasitic capacitance at the internal nodes of the op amp. The slight reduction in supply current is caused by the combination of the shallow trench isolation (STI) effect, the well proximity effect (WPE) and the layout design with fingers. As a result, the FOMs in the post-layout simulation shows a reduction of 12.5% compared with the schematic simulation. The slew rates in the schematic and post-layout simulations are the same because the voltage swings at the gates of transistors M14 and M15 are close to the rail-to-rail supply voltage in both cases. The detailed simulation results of the designed op amp's AC and transient responses with schematic and post-layout views are shown in Figure 4.25 to Figure 4.27. The schematic and post-layout simulation results are compared in Table 4.7 in Section 4.7.



Figure 4.24: Layout view of the designed op amp



Figure 4.25: Proposed op amp's transient responses in schematic and post-layout simulation









Figure 4.27: Proposed op amp's frequency responses in schematic and post-layout simulation b) magnified bode plots around 0dB

# 4.7. Performance Comparison of This Work with the Literature

Table 4.7 shows the performance comparison among [3], [9] and the proposed op amp in schematic and layout views. As can be seen, the post-layout simulation results of this work show favorable performance in the small-signal (FOMs), large-signal (FOM<sub>L</sub>), and settling-time (FOM<sub>Ts\_x%</sub>) figures of merit. FOM<sub>Ts\_x%</sub> is defined as $C_L/(I_{supply} * Ts_{x\%})$, where $Ts_{x\%}$ is the settling time of the op amp with x% settling accuracy. Work [9] is also redesigned in the same 0.18um CMOS process used for the proposed op amp. The redesigned op amp has exactly the same transistor sizes, bias current and total supply current as reported in [9]. The redesigned op amp is not stable with a 15nF capacitive load, so its performance is shown with a 100nF capacitive load only. Under a 100nF capacitive load, compared with the redesigned op amp, the proposed op amp has a similar FOMs but a much higher phase margin, FOM<sub>Ts\_1%</sub> and FOM<sub>Ts\_0.1%</sub>. In addition, when the supply voltage changes by +/-10%, the supply current of the redesigned [9] changes by +89.3% and -48.9% respectively, whereas that of the proposed op amp changes by only +1.7% and -1.8%. As reviewed in Section 4.2.2, the supply current of [9], and therefore of the redesigned op amp, is extremely sensitive to its supply voltage because its preamp stages greatly amplify current errors caused by channel length modulation.
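The settling-time figure of merit can be checked against the 100nF columns of Table 4.7. The sketch below is an illustration under the definition given above; the function name `fom_settling` is hypothetical, and the input values are copied from Table 4.7.

```python
def fom_settling(cl_pf, i_supply_ua, ts_us):
    """FOM_Ts = C_L / (I_supply * Ts), in pF/(us*uA)."""
    return cl_pf / (i_supply_ua * ts_us)


CL_PF = 100_000.0  # 100 nF load expressed in pF

# (I_supply in uA, average 1% settling time in us), from Table 4.7.
designs = {
    "this work, schematic": (6.44, 4.90),
    "this work, post-layout": (6.208, 4.96),
    "[9], redesigned": (24.0, 26.15),
}

for name, (i_supply, ts) in designs.items():
    print(f"{name}: FOM_Ts_1% = {fom_settling(CL_PF, i_supply, ts):.1f}")
```

The computed values agree with the FOM<sub>Ts\_1%</sub> row of Table 4.7 to within rounding.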

Table 4.8 shows the performance comparison among [3], [9] and the proposed op amp over process corner and mismatch variations. In the typical corner, the proposed op amp's post-layout simulated FOMs is 25 times that of [3] and 1.8 times that of [9], while its FOM<sub>L</sub> is 1198 times that of [3] and 7.8 times that of [9]. Even in the worst case of the schematic simulations, the FOMs of the proposed op amp is 20.6 times that of [3] and 1.5 times that of [9], while its FOM<sub>L</sub> is 632.6 times that of [3] and 4.1 times that of [9]. If trimming bits are available to trim the resistor values in the designed op amp, the variations in FOMs and FOM<sub>L</sub> can be reduced. The performance improvements of this work over [3] and [9] are mainly achieved by structurally decoupling large- and small-signal operations and eliminating any wasted current in the preamp's load circuits. In addition, unlike [3][9], the quiescent current of all the branches in the designed op amp is well defined.
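The improvement factors quoted above are plain ratios of the figures of merit in Tables 4.7 and 4.8. The following sketch recomputes them; the dictionary layout and the helper name `improvement` are hypothetical and serve only to organize the published numbers.

```python
def improvement(ours, theirs):
    """Ratio of this work's figure of merit to a reference design's."""
    return ours / theirs


# Reported figures of merit for the reference designs.
reference = {
    "[3]": {"FOMs": 66.0, "FOML": 1.916},
    "[9]": {"FOMs": 912.5, "FOML": 293.8},
}

# This work: post-layout typical values (Table 4.7, 15 nF) and the
# minimum values over the P.Mis Monte Carlo runs (Table 4.8).
this_work = {
    "post-layout, typical": {"FOMs": 1653.0, "FOML": 2295.0},
    "schematic, worst case": {"FOMs": 1361.0, "FOML": 1212.0},
}

for case, ours in this_work.items():
    for ref, theirs in reference.items():
        ratio_s = improvement(ours["FOMs"], theirs["FOMs"])
        ratio_l = improvement(ours["FOML"], theirs["FOML"])
        print(f"{case} vs {ref}: FOMs x{ratio_s:.1f}, FOML x{ratio_l:.1f}")
```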


|                                           | <sup>+</sup>NCM, JSSC'15 [3] | <sup>+</sup>Hybrid, JSSC'16 [9] | *This work, schematic | *This work, schematic | *This work, post-layout | *This work, post-layout | *[9], redesigned |
|-------------------------------------------|--------|--------|-------|-------|--------|--------|-------|
| CMOS process (µm)                         | 0.18   | 0.13   | 0.18  | 0.18  | 0.18   | 0.18   | 0.18  |
| VDD (V)                                   | 1.2    | 0.7    | 1.5   | 1.5   | 1.5    | 1.5    |       |
| IDD (µA)                                  | 3      | 24     | 6.44  | 6.44  | 6.208  | 6.208  | 24    |
| DC gain (dB)                              | 84     | ~100   | 93.5  | 93.5  | 92.09  | 92.09  | 80    |
| CL (nF)                                   | 15     | 15     | 15    | 100   | 15     | 100    | 100   |
| GBW (MHz)                                 | 0.396  | 1.46   | 0.811 | 0.125 | 0.684  | 0.105  | 0.446 |
| PM (°)                                    | 81     | 66     | 66.4  | 86.36 | 63.97  | 85.97  | 64.11 |
| SR (V/µs)                                 | 0.01   | 0.47   | 0.95  | 0.14  | 0.95   | 0.14   | 0.01  |
| Avg. 1% settling (µs)                     | 47.00  | 1.41   | 0.80  | 4.90  | 0.87   | 4.96   | 26.15 |
| Avg. 0.1% settling (µs)                   | -      | -      | 0.85  | 6.00  | 0.94   | 6.66   | 27.86 |
| FOMs (pF*MHz/µA)                          | 66     | 912.5  | 1,889 | 1,941 | 1,653  | 1,691  | 1,858 |
| FOM<sub>L</sub> (pF*V/µs/µA)              | 1.916  | 293.8  | 2,201 | 2,174 | 2,295  | 2,302  | 38    |
| FOM<sub>Ts_1%</sub> (pF/µs/µA)            | 106.4  | 443.3  | 2,922 | 3,169 | 2,793  | 3,248  | 159.3 |
| FOM<sub>Ts_0.1%</sub> (pF/µs/µA)          | -      | -      | 2,756 | 2,588 | 2,577  | 2,420  | 149.6 |
| Area (mm<sup>2</sup>)                     | 0.0013 | 0.0027 | -     | -     | 0.0064 | 0.0064 | -     |

Table 4.7: Performance comparison of this work in schematic and post-layout view with recently reported amplifiers

Notes: <sup>+</sup> represents measurement results and <sup>\*</sup> represents simulation results.

Table 4.8: Performance comparison of this work with recently reported amplifiers

|                         | <sup>+</sup>NCM,<br>JSSC'15 [3] | <sup>+</sup>Hybrid,<br>JSSC'16 [9] | *This<br>work_typ | *This<br>work_min | *This<br>work_max |
|-------------------------|----------------------------------|-------------------------------------|-------------------|-------------------|-------------------|
| CMOS process (um)       | 0.18                             | 0.13                                | 0.18              | 0.18              | 0.18              |
| CL (nF)                 | 15                               | 15                                  | 15                | 15                | 15                |
| VDD (V)                 | 1.2                              | 0.7                                 | 1.5               | 1.5               | 1.5               |
| IDD (µA)                | 3                                | 24                                  | 6.56              | 4.299             | 12.29             |
| DC gain (dB)            | 84                               | ~100                                | 92.6              | 89.12             | 95.38             |
| GBW (MHz)               | 0.396                            | 1.46                                | 0.85              | 0.47              | 1.742             |
| PM (°)                  | 81                               | 66                                  | 62.5              | 52.73             | 71.54             |
| SR (V/µs)               | 0.01                             | 0.47                                | 1.1               | 0.95              | 1.25              |
| Avg. 1% settling (µs)   | 47                               | 1.41                                | 0.73              | 0.41              | 1                 |
| Avg. 0.1% settling (µs) | NA                               | NA                                  | 0.93              | 0.47              | 1.72              |
| FOMs (pF/MHz-µA)        | 66                               | 912.5                               | 1940              | 1361              | 2465              |
| FOM<sub>L</sub> (pF*V/us-µA) | 1.916                       | 293.8                               | 2500              | 1212              | 3900              |

\*The minimum and maximum performance of this work is reported based on 1000-run Monte Carlo simulation with both process corner and mismatch variation enabled.

#### 4.8. Discussion

If a tighter spread of the designed op amp's GBW is needed under all process corner variations without the aid of any trimming circuits, a more sophisticated bias strategy is needed. As shown in Section 4.5.2, the GBW of the designed op amp is proportional to $g_m^5 * R^4$. With the constant gm bias circuit [11], gm becomes approximately proportional to 1/R, and the expression for the GBW simplifies to being proportional to $g_m$, or equivalently to 1/R. To reduce the GBW spread further under process corner variations, the gm of one of the preamp stages needs to be constant instead of proportional to 1/R. This can be achieved by biasing the tail current of one of the preamp stages with a fixed bias current.

# 4.9. Summary

A new power-efficient design technique for op amps driving large capacitive loads has been introduced to substantially boost both the op amp's small- and large-signal performance. An op amp designed with the new technique decouples large- and small-signal performance, has a well-defined quiescent current in all the preamp stages, and eliminates current waste in the preamp's load circuits. Because of these features, the designed op amp is much less sensitive to random device mismatches and can be optimized for both large- and small-signal performance. The trade-off between the gain-bandwidth product enhancement and the number of preamp stages for the proposed op amp has also been discussed. The proposed op amp has been simulated in a 180nm CMOS process under three different conditions. The simulation results agree well with the theoretical analysis. Compared with the state-of-the-art methods [3][9], the designed op amp shows very favorable FOM<sub>s</sub> and FOM<sub>L</sub>. The results show that the proposed power-efficient op amp design is suitable for applications such as LCD gamma buffers, where a large capacitive load is driven.

#### 4.10. References

- [1]. "LM6584 TFT-LCD Quad, 13V RRIO high output current operational amplifier," Mar. 2013 [Online]. Available: www.ti.com/cn/lit/gpn/lm6584
- [2]. "4-channel, rail-to-rail, CMOS buffer amplifier," in Rev. B Texas Instruments, Jul. 2004 [Online]. Available: *http://www.ti.com/product/buf04701*
- [3]. Z. Yan, P. I. Mak, M. K. Law, R. P. Martins and F. Maloberti, "Nested-Current-Mirror Rail-to-Rail-Output Single-Stage Amplifier With Enhancements of DC Gain, GBW and Slew Rate," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 10, pp. 2353-2366, Oct. 2015.
- [4]. X. Peng, W. Sansen, L. Hou, J. Wang, and W. Wu, "Impedance adapting compensation for low-power multistage amplifiers," *IEEE J. Solid State Circuits*, vol. 46, no. 2, pp. 445–451, Feb. 2011.
- [5]. S. S. Chong and P. K. Chan, "Cross feedforward cascode compensation for low-power three-stage amplifier with large capacitive load," *IEEE J. Solid State Circuits*, vol. 47, no. 9, pp. 2227–2234, Sep. 2012.
- [6]. Z. Yan, P.-I. Mak, M.-K. Law, and R. P. Martins, "A 0.016-mm<sup>2</sup> 144-µW three-stage amplifier capable of driving 1-to-15 nF capacitive load with 0.95-MHz GBW," *IEEE J. Solid State Circuits*, vol. 48, no. 2, pp. 527–540, Feb. 2013.



- [7]. M. Tan and W.-H. Ki, "A cascode Miller-compensated three-stage amplifier with local impedance attenuation for optimized complex-pole control," *IEEE J. Solid State Circuits*, vol. 50, no. 2, pp. 440–449, Feb. 2015.
- [8]. K. N. Leung and P. K. T. Mok, "Analysis of multistage amplifier-frequency compensation," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 48, no. 9, pp. 1041-1056, 2001.
- [9]. K. H. Mak, M. W. Lau, J. P. Guo, T. W. Mui, M. Ho, W. L. Goh, and K. N. Leung, "A Hybrid OTA Driving 15 nF Capacitive Load With 1.46 MHz GBW," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 11 (2015): 2750-2757.
- [10]. V. V. Ivanov, and I. M. Filanovsky, "OpAmp Gain Structure, Frequency Compensation and Stability," in Operational amplifier speed and accuracy improvement: analog circuit design with structural methodology, Springer Science & Business Media, 2006, pp. 76
- [11]. N. Talebbeydokhti, P. K. Hanumolu, P. Kurahashi and Un-Ku Moon, "Constant transconductance bias circuit with an on-chip resistor," 2006 IEEE International Symposium on Circuits and Systems, Island of Kos, 2006, pp. 4 pp.-2860



# **CHAPTER 5. CURRENT UTILIZATION EFFICIENCY ENHANCEMENT FOR FOLDED CASCODE AMPLIFIERS**

# **5.1. Introduction**

Op amps are among the most fundamental building blocks of many analog and mixed-signal systems. Among different op amp structures, folded cascode amplifiers (FCAs) are one of the most widely used architectures in single- and multi-stage op amp designs because FCAs have high gain, a wide input common-mode range (ICMR) and a reasonably large output voltage swing (OSW) [1]. PMOS input FCAs, due to their higher non-dominant poles, lower flicker noise, and lower input common-mode levels, have become the primary choice over their NMOS counterparts. Moreover, PMOS input FCAs allow for input switches using a single NMOS transistor in switched-capacitor (SC) applications [2].



Figure 5.1: Schematic of a conventional folded cascode amplifier (FCA)

Figure 5.1 shows a conventional PMOS input FCA. In this design, the tail current ($I_{tail}$) of the op amp is chosen to meet the FCA's noise and GBW specifications. The cascode stage current ($I_b$) is conventionally set larger than $0.5I_{tail}$ to avoid the long recovery time that results when the input pair (M1-M2) enters the triode region and the cascode transistors (M7-M8) enter the cutoff region during slewing. In practice, $I_b$ is usually designed to be about $0.7*I_{tail}$ to provide some design margin against random mismatch variations [3]. Therefore, the total bias current of the two cascode branches is about 1.4 times the FCA's tail current. Unfortunately, this large bias current in the FCA's cascode stage not only dramatically increases the power consumption of the FCA but also degrades its noise and offset voltage performance, as discussed below.

The input referred offset voltage of the FCA in Figure 5.1 is given by (5-1), in which $\Delta V_{12}$, $\Delta V_{34}$ and $\Delta V_{56}$ are the offset voltages of transistor pairs M1-M2, M3-M4 and M5-M6, and $g_{mi}$ is the transconductance of transistor $M_i$, i=1,2,...,6. To calculate noise, we assume for simplicity that transistors M1-M6 have the same length, current density, and flicker noise constant. The FCA's input referred noise power density is then calculated and simplified as (5-2), where $\overline{V_{ni}^2}$ is the noise power density of transistor $M_i$. $K_f$ and $C_{ox}$ are the flicker noise constant and the oxide capacitance per unit area, and $W_1$ and $L_1$ are the width and length of transistor M1. Also, k and T are respectively the Boltzmann constant and the temperature in kelvin. As can be seen from (5-1) and (5-2), the FCA's offset voltage and noise drop as $g_{m3}/g_{m1}$ and $g_{m5}/g_{m1}$ decrease. For a given target GBW and capacitive load ($C_L$), $g_{m1}$ is set to $GBW*C_L$. Therefore, a power-efficient way to reduce the FCA's noise and offset voltage is to decrease the transconductance of transistors M3-M6 by reducing their bias currents.

$$V_{\rm os} = \Delta V_{12} + \Delta V_{34} * \frac{g_{\rm m3}}{g_{\rm m1}} + \Delta V_{56} * \frac{g_{\rm m5}}{g_{\rm m1}}$$
(5-1)

$$\begin{aligned}\overline{V_{n,in}^{2}} &= 2\left[\overline{V_{n1}^{2}} + \overline{V_{n3}^{2}} \left(\frac{g_{m3}}{g_{m1}}\right)^{2} + \overline{V_{n5}^{2}} \left(\frac{g_{m5}}{g_{m1}}\right)^{2}\right] \\ &= \left(\frac{16kT}{3g_{m1}} + \frac{2K_f}{W_1L_1C_{ox}f}\right)\left(1 + \frac{g_{m3}}{g_{m1}} + \frac{g_{m5}}{g_{m1}}\right)\end{aligned}$$
(5-2)
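As a rough numerical illustration of the simplified form of (5-2), the sketch below evaluates the input referred noise density for two sets of transconductance ratios. The function name, the parameter names and every numeric value are illustrative assumptions, not values taken from this design.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def fca_input_noise_psd(gm1, gm3, gm5, freq, kf, w1, l1, cox, temp=300.0):
    """Evaluate the simplified form of (5-2); result is in V^2/Hz."""
    thermal = 16.0 * K_BOLTZMANN * temp / (3.0 * gm1)  # thermal term of the input pair
    flicker = 2.0 * kf / (w1 * l1 * cox * freq)        # flicker term of the input pair
    excess = 1.0 + gm3 / gm1 + gm5 / gm1               # penalty added by M3-M6
    return (thermal + flicker) * excess


# Hypothetical device values, chosen only to exercise the formula.
params = dict(freq=1e3, kf=1e-25, w1=10e-6, l1=1e-6, cox=8e-3)
heavy = fca_input_noise_psd(gm1=20e-6, gm3=20e-6, gm5=12e-6, **params)
light = fca_input_noise_psd(gm1=20e-6, gm3=12e-6, gm5=2e-6, **params)
print(f"noise power ratio (heavy / light cascode bias): {heavy / light:.2f}")
```

Because the input pair terms are common to both cases, the ratio depends only on the factor $1 + g_{m3}/g_{m1} + g_{m5}/g_{m1}$, which is why lowering the cascode stage's transconductance reduces noise without touching $g_{m1}$.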

# **5.2. Literature Review**


#### 5.2.1 General review

In an effort to reduce an FCA's input referred offset voltage and noise, several techniques have been reported in the literature [4]-[6]. These techniques work by reducing $g_{mc}/g_{m1}$, where $g_{mc}$ is the total effective transconductance of the top PMOS and bottom NMOS transistors in the FCA's cascode stage, and $g_{m1}$ is the transconductance of the FCA's input pair. Approach [4] reduces $g_{mc}$ by applying resistive degeneration to the top PMOS and bottom NMOS transistors. However, this approach not only reduces the FCA's ICMR and OSW but also increases area, since a low-power design requires large degeneration resistors. Approach [5] adds a low-noise preamp stage in front of the conventional FCA, but this significantly increases the FCA's power consumption for the same slew rate performance. Approach [6] uses a new turn-around circuit and improves the FCA's noise, input offset voltage and current utilization efficiency (CUE) by reducing the cascode stage's bias current, where CUE is defined as the ratio of the FCA's tail current to its supply current. However, this approach can afford only a slight decrease in the cascode stage's bias current if a long recovery time is to be avoided. In addition, the multiple internal loops of its new turn-around stage require a complicated frequency compensation for stability, which significantly increases the design complexity and area of the FCA. Therefore, in this chapter, we propose a new output stage to enhance the FCA's CUE. The proposed output stage also improves the FCA's noise, offset voltage and gain.



#### 5.2.2 A state-of-the-art FCA design for CUE enhancement

Figure 5.2 shows a state-of-the-art method [6] that improves the FCA's CUE by reducing its cascode stage's bias current. As byproducts, the noise and offset performance of the FCA are also improved. The PMOS-side circuit, formed by transistors M5-M6, M9-M10 and M13-M14, is symmetric to the NMOS-side circuit formed by transistors M3-M4, M7-M8 and M11-M12. Transistors M10 and M14 respectively share the same gate voltages as transistors M9 and M13. In addition, transistors M10 and M13 respectively share the same source voltages as transistors M14 and M9. As the total drain current of transistors M10 and M14 is also the same as that of transistors M9 and M13, the DC bias voltages of nodes 3 and 8 are equal, $V_8=V_3$. Consequently, transistor M10 has a constant bias current that is the same as that of transistor M9. Likewise, the drain currents of transistors M12 and M7 in the NMOS-side circuit are also equal to $I_b$.



Figure 5.2: Rudy's FCA a) the FCA's schematic b) floating battery in the FCA

Upon application of a small negative differential input voltage, the differential signal currents in M1 and M2 are $-\Delta I$ and $\Delta I$, respectively. These currents raise the voltage at node 7 by $\Delta V_7$, whereas the voltage at node 1 stays the same because the $V_{gs}$ of transistor M7 is fixed by its constant bias current. $\Delta V_7$ is then simply shifted up to node 6, so $\Delta V_6 = \Delta V_7$. Therefore, the gate-source voltages of transistors M8 and M11 change by $+\Delta V_7$ and $-\Delta V_7$, respectively. Since transistors M7-M8 and M11-M12 are symmetrically designed with the same size, the differential signal currents in M8 and M11 are $-\Delta I$ and $\Delta I$, respectively. The signal current in M8 is ultimately copied to transistor M13 by the circuit formed by M5-M6, M9-M10 and M13-M14. Therefore, the signal currents in transistors M13 and M11 are $-\Delta I$ and $\Delta I$, respectively. This holds true only when $|\Delta I|$ is less than or equal to $2*I_b$, or equivalently when $|\Delta V_7| \leq V_{od8}$, where $V_{od8}$ is the quiescent overdrive voltage of transistor M8.

However, when  $\Delta I$  exceeds 2\*I<sub>b</sub>,  $\Delta V_7 > V_{od8}$  and transistors M8 and M13-M14 work in the cutoff region, whereas M11 still works in the saturation region. In this case, the changes of the drain currents in transistors M4, M3, M8 and M11 are respectively  $\Delta I$ -I<sub>b</sub>,  $\Delta I$ -I<sub>b</sub>, -I<sub>b</sub> and 2 $\Delta I$ -I<sub>b</sub>. Therefore, if  $\Delta I$ =0.5I<sub>tail</sub>> 2I<sub>b</sub> occurs in a negative slewing phase, the drain currents of the transistors become I<sub>2</sub>=I<sub>11</sub>= I<sub>tail</sub>, I<sub>1</sub>=I<sub>8</sub>=I<sub>14</sub>=I<sub>13</sub>=0, I<sub>3</sub>=I<sub>4</sub>= I<sub>tail</sub>+I<sub>b</sub>, I<sub>5</sub>=I<sub>6</sub>=I<sub>b</sub> and I<sub>10</sub>=I<sub>b</sub>, where I<sub>i</sub> is transistor M<sub>i</sub>'s drain current and i=1,2...14. Similarly, in a positive slewing phase, I<sub>2</sub>=I<sub>11</sub>=0, I<sub>1</sub>=I<sub>8</sub>=I<sub>14</sub>=I<sub>13</sub>=I<sub>4</sub>=I<sub>14</sub>=I<sub>13</sub>=I<sub>4</sub>. As a result, Rudy's FCA [6] in Figure 5.2 has the same slew rate as a conventional FCA even when I<sub>b</sub> is smaller than 0.5\*I<sub>tail</sub>. However, as mentioned, if I<sub>b</sub><0.25I<sub>tail</sub>, either M8 or M11 works in the cutoff region during the negative or positive slewing phase, which increases the recovery time of the FCA after slewing completes. This sets a lower bound on the total bias current of this FCA's cascode stage, 4\*I<sub>b</sub>> I<sub>tail</sub>. Therefore, the maximum achievable CUE of this FCA is limited to less than 50% if a long recovery time is to be avoided.
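The CUE bound above follows directly from the definition of CUE as tail current divided by supply current. The sketch below illustrates the arithmetic under the simplifying assumption that the supply current is just the tail current plus the total cascode stage bias current (bias generators are ignored); the function name is hypothetical.

```python
def current_utilization_efficiency(i_tail, i_cascode_total):
    """CUE = tail current / supply current.

    Assumes the supply current is the tail current plus the total
    cascode stage bias current; bias generators are ignored.
    """
    return i_tail / (i_tail + i_cascode_total)


# Conventional FCA: two cascode branches at I_b = 0.7 * I_tail each.
cue_conventional = current_utilization_efficiency(1.0, 2 * 0.7)

# FCA of [6] at the recovery-time boundary 4 * I_b = I_tail.
cue_boundary = current_utilization_efficiency(1.0, 4 * 0.25)

print(f"conventional: {cue_conventional:.1%}, boundary of [6]: {cue_boundary:.1%}")
```

Any total cascode bias above $I_{tail}$ pushes the result below the boundary value, which is the sense in which the CUE of the FCA in [6] stays under 50%.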

In addition, this FCA needs a complex frequency compensation, which significantly increases its area overhead. There are two translinear loops in the FCA in Figure 5.2. One loop is M13-M5-M6-M14-M13 and the other is M11-M3-M4-M12-M11. The two translinear loops make the circuit between nodes 7 and 8 work as a floating battery. Therefore, the resistance at nodes 7 and 8 is about $1/(g_{ds6}+g_{ds4}+g_{ds2})$, where $g_{ds6}$, $g_{ds4}$ and $g_{ds2}$ are respectively the output conductances of transistors M6, M4 and M2. Because the FCA has two high impedance nodes, namely node 7 (or 8) and the output node, the complex frequency compensation shown in Figure 5.2 is needed in [6] to stabilize the FCA, which unfortunately increases the FCA's design complexity and area overhead dramatically. In the design example, the area of the compensation capacitors and resistors is as large as that of the FCA core. In summary, method [6] improves the FCA's CUE slightly but at the cost of significantly increased design complexity and area overhead.

# 5.3. Proposed FCA Output Stage Design for Low Noise, Offset and Power

In this section, we summarize the desired features of an effective FCA output stage. Based on the desired features, a conceptual FCA output stage design is presented. Then, actual circuits are designed to implement the conceptual FCA output stage.

#### 5.3.1 Desired features and conceptual design of a FCA output stage

A conceptual single-stage FCA design showing the desired features of its output stage is given in Figure 5.3. As is widely known, the FCA's input stage has a fixed trade-off among noise, power and speed. The input stage's transconductance, $g_m$, is set to $GBW*C_L$ to meet the GBW specification, where $C_L$ is the FCA's load capacitor. In addition, $g_m$ must be large enough to meet the FCA's noise specification, as shown in (5-2). Assuming that thermal noise dominates the FCA's total noise, the input pair's $g_m$ needs to be larger than $16*k*T*m/(3*V_{ni,spec}^2)$, where k, T and $V_{ni,spec}$ are respectively the Boltzmann constant, the operating temperature in kelvin, and the FCA's input referred voltage noise specification. Also, m represents the ratio of the FCA's total noise power to the noise power from the input pair. Therefore, the input pair's $g_m$ needs to be large enough, as shown in (5-3), to meet both the GBW and noise specifications. With a constant $g_m/I_D$ ($g_{m,efficiency}$) design strategy, the input pair's tail current ($I_{tail}$) can be found as $2*g_{m,spec}/g_{m,efficiency}$.
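The sizing procedure described above can be sketched in a few lines. This is one reading of the text, assuming that $g_{m,spec}$ is the larger of the GBW-driven and noise-driven requirements, that GBW is expressed in rad/s, and that $V_{ni,spec}$ is a noise density in V/√Hz. The function names and the example numbers are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def required_input_gm(gbw_rad, c_load, vn_spec, m, temp=300.0):
    """Larger of the speed-driven and thermal-noise-driven gm requirements."""
    gm_speed = gbw_rad * c_load
    gm_noise = 16.0 * K_BOLTZMANN * temp * m / (3.0 * vn_spec ** 2)
    return max(gm_speed, gm_noise)


def tail_current(gm_spec, gm_over_id):
    """I_tail = 2 * gm_spec / (gm/ID), each input transistor carrying I_tail / 2."""
    return 2.0 * gm_spec / gm_over_id


# Hypothetical specification: 2 MHz GBW, 1 pF load, 100 nV/sqrt(Hz), m = 1.5.
gm_spec = required_input_gm(2 * math.pi * 2e6, 1e-12, 100e-9, m=1.5)
i_tail = tail_current(gm_spec, gm_over_id=20.0)
print(f"gm_spec = {gm_spec * 1e6:.2f} uS, I_tail = {i_tail * 1e6:.2f} uA")
```

With these example numbers the GBW requirement dominates; tightening the noise specification eventually makes the noise term the binding one.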



Figure 5.3: Desired features of a FCA's output stage

The FCA's output stage takes the differential current from the input pair and conveys it to the output of the FCA. The output stage must be able to convey at least $I_{tail}$ to the output in large signal operation. Ideally, the output stage not only passes the differential current from the input pair but also amplifies it before passing it to the output node. That is, we want $A_{i\_tran}$ to be large, where $A_{i\_tran}$ is the ratio of the output current ($I_{out}$) to the input differential current ($I_{dm}$) in large signal operation. In small signal operation, $I_{dm}$ is converted to an output voltage by the output stage's output resistance, $R_{out}$. Ideally, $R_{out}$ should be infinite to generate infinite DC gain. In addition, the FCA's output stage should ideally contribute zero offset and noise (m=1) and consume zero bias current. In summary, a desired FCA output stage should have a large $A_{i\_tran}$ (a large current conveyance capability), a very large $R_{out}$, zero power consumption, and zero noise and offset contribution. However, meeting all these requirements together is usually very difficult, because a large current conveyance capability typically requires a large bias current in the output stage, while a large bias current not only increases the FCA's power consumption, noise and offset voltage but also reduces the FCA's $R_{out}$. Clearly, tradeoffs need to be made among a sufficiently large current conveyance capability, a sufficiently large $R_{out}$, minimal noise, minimal offset and minimal power consumption.

To mitigate the tradeoffs above, a conceptual FCA design with the proposed output stage is shown in Figure 5.4. The basic idea is to decouple the large-signal and small-signal operations. The large-signal path determines the current conveyance capability. This path is normally off so that this path needs zero bias current and contributes zero noise and zero offset. On the other hand, the small-signal path is always on with minimal bias current so that power, noise and offset caused by this path are also minimized. The gain is also maximized. A circuit implementation of the conceptual design is discussed in the following sections.





Figure 5.4: A conceptual design of a FCA output stage

#### **5.3.2** Proposed FCA core amplifier design

For a differential-input single-ended output FCA, a differential-to-single-ended conversion circuit needs to be implemented in the FCA. The two types of the differential-to-single-ended conversion circuits with a PMOS input FCA are shown in Figure 5.5. Figure 5.5(a) implements the conversion by a top PMOS current mirror, whereas Figure 5.5(b) implements it by a bottom NMOS current mirror. The two FCAs are analyzed and their advantages and disadvantages are discussed.

The FCA in Figure 5.5(a) is the most conventional design for a PMOS input FCA. In the signal path from $V_{in+}$ to $V_o$, there are three low impedance nodes: nodes (1), (2), and (4). The impedances of these nodes are respectively $1/g_{m7}$, $1/g_{m5}$ and $1/g_{m10}$, all of which are sensitive to the cascode stage's bias current, $A*I_{tail}$. In the signal path from $V_{in-}$ to $V_o$, there is only one low impedance node, node (5). This node's impedance is $1/g_{m8}$, which is also very sensitive to the cascode stage's bias current. Therefore, all the nondominant poles in Figure 5.5(a) are highly dependent on the cascode stage's bias current.





After writing and solving the KCL equations at nodes (1)~(5), the transfer function from the FCA's input to its output is obtained as (5-4), where $C_i$ is the parasitic capacitance at node i and $g_{mi}$ is the transconductance of transistor $M_i$. The transfer function has three nondominant poles and two zeros. One pole is always located at $g_{m7}/C_1$. The remaining two nondominant poles and the two zeros can be either complex or real, depending on whether $g_{m10}/C_4<4g_{m5}/C_2$ or not. When $g_{m10}/C_4<4g_{m5}/C_2$, the complex pole pair's natural frequency is $\sqrt{g_{m5}g_{m10}/(C_2C_4)}$, whereas that of the complex zero pair is $\sqrt{2g_{m5}g_{m10}/(C_2C_4)}$. If $g_{m10}/C_4>4g_{m5}/C_2$, the nondominant pole at $g_{m10}/C_4$ cancels the zero at the same frequency. The remaining nondominant poles are then at $g_{m5}/C_2$ and $g_{m7}/C_1$, and the remaining zero is at $2g_{m5}/C_2$. In summary, regardless of whether the zeros and poles are complex or real, the frequencies of the nondominant poles and zeros are highly sensitive to the cascode stage's bias current but are independent of the tail current. The smaller the bias current of the cascode stage, the lower the nondominant pole and zero frequencies and the lower the phase margin. This fundamentally limits how far the cascode stage's bias current can be reduced.

$$TF_{slow,FCA} = \frac{\frac{g_{m1}}{2 * g_L}}{(1 + s\frac{C_L}{g_L})\left(1 + s\frac{C_1}{g_{m7}}\right)} * \frac{s^2 + \frac{g_{m10}}{C_4}s + \frac{2 * g_{m5}g_{m10}}{C_2C_4}}{s^2 + \frac{g_{m10}}{C_4}s + \frac{g_{m5}g_{m10}}{C_2C_4}}$$
(5-4)

$$TF_{fast,FCA} = \frac{\frac{g_{m1}}{2 * g_L}}{(1 + s\frac{C_L}{g_L})\left(1 + s\frac{C_1}{g_{m7}}\right)} * \frac{s^2 + \frac{g_{m8}}{C_5}s + \frac{2 * g_{m4}g_{m8}}{C_2C_5}}{s^2 + \frac{g_{m8}}{C_5}s + \frac{g_{m4}g_{m8}}{C_2C_5}}$$
(5-5)
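The second-order factor shared by (5-4) and (5-5) can be explored numerically. The sketch below solves $s^2 + as + b = 0$ for the nondominant pole pair and $s^2 + as + 2b = 0$ for the zero pair, with $a$ the cascode node term and $b$ the product term of the equations. The function names and the device values are hypothetical and serve only to show the complex-root case.

```python
import cmath
import math


def quadratic_roots(a, b):
    """Roots of s^2 + a*s + b = 0 (complex when a^2 < 4*b)."""
    disc = cmath.sqrt(a * a - 4.0 * b)
    return (-a + disc) / 2.0, (-a - disc) / 2.0


def nondominant_poles_zeros(gm_mirror, c_mirror, gm_cascode, c_cascode):
    """Pole and zero pairs of the second-order factor in (5-4) or (5-5).

    For (5-4) pass gm5, C2, gm10, C4; for (5-5) pass gm4, C2, gm8, C5.
    """
    a = gm_cascode / c_cascode
    b = gm_mirror * gm_cascode / (c_mirror * c_cascode)
    return quadratic_roots(a, b), quadratic_roots(a, 2.0 * b)


# Hypothetical values: gm5 = gm10 = 10 uS, C2 = 20 fF, C4 = 10 fF.
poles, zeros = nondominant_poles_zeros(10e-6, 20e-15, 10e-6, 10e-15)
print("pole natural frequency (rad/s):", abs(poles[0]))
print("zero natural frequency (rad/s):", abs(zeros[0]))
```

For a complex pair the magnitude of either root equals the natural frequency, so the zero pair sits a factor of $\sqrt{2}$ above the pole pair, as stated in the text.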

The alternative FCA design in Figure 5.5(b) mitigates the dependency of the nondominant poles on the cascode stage's bias current. In the signal path from $V_{in+}$ to $V_o$, there are still three low impedance nodes: nodes (1), (2), and (5). As the minimum drain current of transistor M3 is $0.5*I_{tail}$, the nondominant pole at node (2), $g_{m3}/C_2$, is always at a very high frequency even with zero cascode bias current. This pole is therefore not very sensitive to a low bias current in the cascode stage. After writing and solving the KCL equations at nodes (1)~(5), the transfer function from the FCA's inputs to its output is obtained as (5-5). One nondominant pole is always at $g_{m7}/C_1$. When $g_{m8}/C_5 < 4g_{m4}/C_2$, the remaining two nondominant poles are complex with a natural frequency of $\sqrt{g_{m4}g_{m8}/(C_2C_5)}$, whereas the two zeros are complex with a natural frequency of $\sqrt{2g_{m4}g_{m8}/(C_2C_5)}$. As can be seen, the natural frequencies of the complex poles and zeros are proportional to $\sqrt{g_{m4}}$, and $g_{m4}$ is proportional to $\sqrt{I_b + 0.5I_{tail}}$ instead of $\sqrt{I_b}$. Therefore, when $I_b$ is much smaller than $I_{tail}$, the frequencies of the two nondominant poles and two zeros of the alternative FCA are considerably higher than those of the conventional FCA, especially considering that the mobility of NMOS transistors is about 2~3x that of PMOS transistors. When $g_{m8}/C_5 > 4g_{m4}/C_2$, the two zeros and two nondominant poles of the alternative FCA become real. Consequently, the nondominant pole and zero at $g_{m8}/C_5$ cancel each other. The frequencies of the remaining nondominant pole and zero are respectively $g_{m4}/C_2$ and $2g_{m4}/C_2$, which are much higher than the conventional FCA's pole and zero ($g_{m5}/C_2$ and $2g_{m5}/C_2$).

The superior speed of the alternative FCA is also confirmed by simulation results of two FCA design examples in a 180nm CMOS process. The first design example uses the conventional FCA structure, whereas the second uses the alternative structure. Because of their speed difference, the two FCAs are referred to as the slow and fast FCAs, respectively. The simulated frequency and transient responses of the fast and slow FCAs are shown in Figure 5.6 and Figure 5.7. The fast FCA has a higher phase margin and a slightly higher GBW. Other performance metrics of the two design examples are summarized in Table 5.1.

In addition to its higher speed, the fast FCA structure also reduces the number of bias voltages by one, because $V_{b4}$ is no longer needed and $V_{b1}$ is shared with the tail current bias. Furthermore, the negative slew rate of the fast FCA does not depend on the cascode stage's bias current, whereas that of the slow FCA does. Because it needs fewer biasing circuits and is faster, the fast FCA structure is chosen as the core amplifier for the proposed FCA.



Figure 5.6: Frequency responses of the conventional fast and slow FCA



|                          | Conventional slow FCA | Conventional fast FCA |
|--------------------------|-----------------------|-----------------------|
| Gain (dB)                | 80                    | 80                    |
| GBW (MHz)                | 1.9                   | 2.1                   |
| Phase margin (degrees)   | 71.4                  | 82                    |
| 0.1% settling time (µs)  | 0.778                 | 0.719                 |
| 0.01% settling time (µs) | 1.31                  | 0.811                 |
| Vno (0.01~100 kHz) (µV)  | 49.3                  | 49                    |
| Vno (0.01~2 MHz) (µV)    | 138.4                 | 138.4                 |
| Isupply (µA)             | 5                     | 5                     |
| CL (pF)                  | 1                     | 1                     |

Table 5.1: Performance summary of the designed conventional slow and fast FCAs





Figure 5.7: Transient responses of the fast and slow FCA

#### 5.3.3 Proposed FCA output stage design

#### 5.3.3.1 Operation Principle

The conceptual design of the FCA's output stage and the findings about the FCA core amplifier motivate the proposed FCA design shown in Figure 5.8. The proposed FCA consists of a fast FCA core and an additional turn-around stage. The turn-around stage is normally off and is activated only during the FCA's positive slewing phase. This design greatly enhances the FCA's current conveyance capability during the positive slewing phase while keeping the bias current of the turn-around stage to a minimum and generating very little noise and offset voltage. As a result, the bias current of the FCA's cascode stage can be reduced to a value much smaller than $I_{tail}$. The cascode stage's bias current is denoted as $\alpha*I_{tail}$, where $I_{tail}$ is the drain current of transistor M0. The smaller $\alpha$ is, the lower the noise, offset voltage and power consumption of the FCA are. However, $\alpha$ cannot be made arbitrarily small because it affects the frequency of the nondominant pole associated with node $V_x$, as discussed earlier. Therefore, a proper value of $\alpha$ must be selected. In this design, $\alpha=1/12$.



Figure 5.8: Schematic of the proposed FCA with a new turn-around stage

In the proposed FCA, there are two signal paths from the FCA's inputs to its output. The first signal path, shown by the blue lines, always conducts signal current to the output node whenever a differential input voltage exists. The second signal path, marked by the red lines, is activated only when $V_{id}>V_{on}$ or, equivalently, $\Delta V_x > \Delta V_{x,on}$. Here $V_{id}$ is the differential input voltage, $\Delta V_x$ is the voltage change at node $V_x$ upon application of $V_{id}$ at the input pair, and $V_{on}$ and $\Delta V_{x,on}$ are respectively the threshold values of $V_{id}$ and $\Delta V_x$ required to activate the turn-around stage. The operation of the two signal paths is detailed below.

Transistor M13 is designed to be twice the size of transistor M8 but carries the same bias current. As a result, M13 works in the triode region in quiescent operation, which gives M13 a low drain-source voltage, so that $V_y$ approximately equals $V_x$. When the DC bias voltage of $V_x$ is kept below transistor M14's threshold voltage, transistor M14 is in the cutoff region, so the turn-around stage is also off in quiescent operation.

However, upon application of a positive differential input signal, $V_{id}$, the source voltage of transistor M13 increases by $\Delta V_x$. Transistor M13 stays in the triode region and the turn-around stage remains off until $V_{id}$ and $\Delta V_x$ reach $V_{on}$ and $\Delta V_{x,on}$, respectively. When $V_{id}=V_{on}$ and $\Delta V_x = \Delta V_{x,on}$, transistor M13 moves from the triode region into the saturation region. Once M13 works in the saturation region, any $V_{id}>V_{on}$ quickly raises the gate voltage of M14 and turns on the turn-around stage. Therefore, the boundary between the enabling and disabling of the turn-around stage can be approximately marked by the transition of M13 from the triode region to the saturation region. At the transition point, the drain currents of M8 and M13 are respectively expressed as (5-6) and (5-7), where $\beta_8 = \mu_n C_{ox} W_8 / L_8$ and $\beta_{13} = \mu_n C_{ox} W_{13} / L_{13}$. Also, $V_{od8}$ and $\Delta I_{d8}$ are respectively M8's quiescent overdrive voltage and drain current change. By dividing (5-6) by (5-7), substituting $\beta_{13} = 2\beta_8$, and using the quiescent relation $0.5\beta_8 V_{od8}^2 = \alpha * I_{tail}$, it is found that $\Delta I_{d8} = -\frac{\alpha}{2} * I_{tail} = -\frac{1}{24}I_{tail}$ and $\Delta V_{x,on} = \left(1 - \frac{\sqrt{2}}{2}\right)V_{od8} \approx 0.3V_{od8}$. At the transition point, M14 is still off and the drain current change of M8 comes from the input differential pair. Therefore, the input referred turn-on voltage, $V_{on}$, of the turn-around stage is derived as (5-8) by solving the KCL equation at M13's source node.



In (5-8),  $g_{m1}$  and  $V_{od1}$  are respectively the transconductance and overdrive voltage of transistor M1. Also, A<sub>4</sub> and A<sub>3</sub> are respectively the aspect ratios of transistors M4 and M3. In this design, A<sub>4</sub>/A<sub>3</sub>=8/7. Therefore, V<sub>on</sub> is found to be about 3mV, assuming that V<sub>od1</sub> is in the neighborhood of 70~80mV.

$$\left(V_{od8} - \Delta V_{x,on}\right)^2 * 0.5\beta_8 = \alpha * I_{tail} + \Delta I_{d8}$$
(5-6)

$$(V_{od8} - \Delta V_{x,on})^2 * 0.5\beta_{13} = \alpha * I_{tail}$$
 (5-7)

$$V_{on} = -\frac{\Delta I_{d8}}{\frac{g_{m1}}{2} \left(1 + \frac{A_4}{A_3}\right)} = \frac{\frac{1}{24} I_{tail}}{\frac{I_{tail}}{2V_{od1}} \left(\frac{A_4}{A_3} + 1\right)} = \frac{V_{od1}}{12 * \left(\frac{A_4}{A_3} + 1\right)}$$
(5-8)
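The turn-on threshold derived in (5-6) to (5-8) can be checked numerically. The sketch below generalizes the derivation to an arbitrary size ratio $\beta_{13}/\beta_8$, assuming the square-law model and the quiescent relation $0.5\beta_8 V_{od8}^2 = \alpha I_{tail}$. The function name and the value used for $V_{od8}$ are hypothetical; $V_{od1}$ = 75 mV is taken from the 70~80 mV range quoted in the text.

```python
import math


def turn_on_threshold(alpha, beta_ratio, v_od8, v_od1, a4_over_a3):
    """Return (delta_vx_on, delta_id8 / I_tail, v_on) from (5-6) to (5-8).

    beta_ratio is beta_13 / beta_8; currents are normalized to I_tail.
    """
    # (5-7) with 0.5 * beta_8 * v_od8^2 = alpha * I_tail
    delta_vx_on = v_od8 * (1.0 - 1.0 / math.sqrt(beta_ratio))
    # (5-6) divided by (5-7)
    delta_id8 = -alpha * (1.0 - 1.0 / beta_ratio)
    # (5-8) with gm1 / 2 = I_tail / (2 * v_od1)
    v_on = -delta_id8 * 2.0 * v_od1 / (1.0 + a4_over_a3)
    return delta_vx_on, delta_id8, v_on


dvx, did8, v_on = turn_on_threshold(
    alpha=1 / 12, beta_ratio=2.0, v_od8=0.1, v_od1=0.075, a4_over_a3=8 / 7
)
print(f"dVx,on = {dvx / 0.1:.3f} * Vod8, dId8 = {did8:.4f} * Itail, Von = {v_on * 1e3:.2f} mV")
```

For the design values $\alpha=1/12$, $\beta_{13}=2\beta_8$ and $A_4/A_3=8/7$, the result reduces to $V_{on} = V_{od1}/[12(A_4/A_3+1)]$, in agreement with (5-8).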

When $V_{id}$ increases to the point that $V_{id}>V_{on}$, transistor M14 turns on, transistor M13 works in the saturation region, and the negative feedback loop formed by M11 and M13-M14 is activated. As a result, $\Delta V_x$ stays at $\Delta V_{x,on}$ regardless of the differential current from M1 and M2, $I_{dm}$, because the negative feedback loop makes M14 compensate $I_{dm}$. Therefore, in the positive slewing phase, the drain currents of M8 and M14 become $0.5\alpha * I_{tail}$ and $I_{tail}(1 - \alpha * A_4/A_3 + 0.5\alpha)$, respectively. The drain current of M14 is then amplified four times and passed to the output to charge the load capacitance, which enhances the positive slew rate of the FCA. Once the output has slewed far enough that $V_{id}$ falls below $V_{on}$, the FCA's turn-around stage is deactivated and M13 returns to the triode region.

As noted above, transistor M8 always carries half of its quiescent bias current in the positive slewing phase, which prevents M8 from ever turning off and keeps the voltage change at $V_x$ very small. As a result, the input transistor M1 does not enter the triode region in the slewing phase. Therefore, although the proposed FCA has an extremely small cascode bias current, it does not require a long time to recover after the slewing phase completes, since a long recovery time is generally caused by either transistor M8 working in the cutoff region or transistor M1 working in the triode region.

In the negative slewing phase, transistor M2 steers all the tail current into transistor M3, and transistor M4 then passes the mirrored current through transistor M8 to discharge the load capacitor. In this phase, the drain currents of transistors M8 and M10 are $[(A_4/A_3)(\alpha+1)-\alpha]*I_{tail}$ and $\alpha*I_{tail}$ respectively, which results in a net discharging current of $[(A_4/A_3)(\alpha+1)-2\alpha]*I_{tail}$ for the load capacitor. This is slightly larger than the discharging current of the conventional FCA, which is $I_{tail}$ when its cascode bias current is larger than $0.5*I_{tail}$.
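The slewing currents quoted in this section can be evaluated for the design values $\alpha=1/12$ and $A_4/A_3=8/7$. The sketch below simply plugs these values into the expressions given in the text, with all currents normalized to $I_{tail}$. The function names are hypothetical, and the factor of four is the turn-around mirror gain stated above.

```python
def positive_slew_currents(alpha, a4_over_a3, mirror_gain=4.0):
    """Currents in the positive slewing phase, normalized to I_tail.

    Returns (I_M8, I_M14, mirrored M14 current delivered toward the output).
    """
    i_m8 = 0.5 * alpha
    i_m14 = 1.0 - alpha * a4_over_a3 + 0.5 * alpha
    return i_m8, i_m14, mirror_gain * i_m14


def negative_slew_current(alpha, a4_over_a3):
    """Net load discharging current in the negative slewing phase, per I_tail."""
    i_m8 = a4_over_a3 * (alpha + 1.0) - alpha
    i_m10 = alpha
    return i_m8 - i_m10


i_m8, i_m14, i_charge = positive_slew_currents(1 / 12, 8 / 7)
i_discharge = negative_slew_current(1 / 12, 8 / 7)
print(f"I_M14 = {i_m14:.3f}, mirrored = {i_charge:.3f}, discharge = {i_discharge:.3f} (x I_tail)")
```

The discharging current comes out slightly above $I_{tail}$, consistent with the comparison against the conventional FCA made above; dividing either current by $C_L$ gives the corresponding slew rate.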

#### 5.3.3.2 Frequency Response Analysis



Figure 5.9: Small signal block diagram of the proposed FCA

In order to understand the frequency response of the proposed FCA in Figure 5.8, its small signal block diagram is drawn in Figure 5.9. By writing KCL equations at nodes $V_1$, $V_2$, $V_x$ and $V_o$, equations (5-9) to (5-12) are obtained, where $g_{mi}$ is transistor $M_i$'s transconductance, and $g_i$ and $C_i$ are respectively the conductance and parasitic capacitance at node i. After solving these equations, the transfer function from the FCA's inputs to its output is derived as (5-13), assuming that the transistors' transconductances are much larger than their output conductances. Then (5-13) is rewritten as (5-14) and further simplified to (5-15) by substituting the expressions in (5-16) into (5-14). The expressions of $g_1$, $g_2$, $g_x$, $g_L$, $C_1$, $C_2$ and $C_x$ are shown in Table 5.2.

Table 5.2: Expressions of the conductances and capacitances in the proposed FCA

| Conductance                                                                 | Capacitance                                                                                      |
|-----------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|
| $g_1 = g_{ds2} + g_{ds3}$                                                   | $C_1 \approx C_{db2} + C_{gd2} + C_{db3} + C_{gd3} + C_{gs7}$                                    |
| $g_2 \approx g_{ds5} g_{ds9}/g_{m9}$                                        | $C_2 \approx C_{gs3} + C_{gd3} + C_{gs4} + C_{gd4}$                                              |
| $g_x \approx g_{ds1} + g_{ds4} + g_{ds11}$                                  | $C_{x} \approx C_{db1} + C_{gd1} + C_{db4} + C_{gd4} + C_{gs8} + C_{gs13} + C_{gd14} + C_{gd14}$ |
| $g_L \approx g_{ds6} g_{ds10}/g_{m10} + (g_{ds1} + g_{ds4}) g_{ds8}/g_{m8}$ |                                                                                                  |

$$\frac{V_{id}}{2} * g_{m1} + V_1(g_1 + sC_1) + g_{m7}V_1 + g_{ds7}(V_1 - V_2) + V_2 * g_{m3} = 0$$
(5-9)

$$V_1 * g_{m7} + (V_1 - V_2) * g_{ds7} - V_2(g_2 + sC_2) = 0$$
(5-10)

$$V_2 * g_{m4} + V_x (g_{m8} + g_{ds8} + g_x + sC_x) - V_o * g_{ds8} - \frac{V_{id}}{2} * g_{m1} = 0$$
 (5-11)

$$V_{o}(g_{L} + g_{ds8} + sC_{L}) = V_{x}(g_{m8} + g_{ds8})$$
(5-12)

$$\frac{V_{o}}{V_{id}} \approx \frac{0.5 * g_{m1} * g_{m8}}{(g_{L} + sC_{L})(g_{m8} + sC_{x})} * \frac{s^{2}C_{1}C_{2} + g_{m7}sC_{2} + (g_{m3} + g_{m4})g_{m7}}{s^{2}C_{1}C_{2} + g_{m7}sC_{2} + g_{m3}g_{m7}}$$
(5-13)

$$\frac{V_{o}}{V_{id}} = \frac{\frac{g_{m1}}{2 * g_{L}}}{(1 + s\frac{C_{L}}{g_{L}})\left(1 + s\frac{C_{x}}{g_{m8}}\right)} * \frac{s^{2} + \frac{g_{m7}}{C_{1}}s + \frac{(g_{m3} + g_{m4})g_{m7}}{C_{1}C_{2}}}{s^{2} + \frac{g_{m7}}{C_{1}}s + \frac{g_{m3}g_{m7}}{C_{1}C_{2}}}$$
(5-14)

$$\frac{V_{o}}{V_{id}} = \frac{\frac{g_{m1}}{2 * g_{L}}}{(1 + s\frac{C_{L}}{g_{L}})\left(1 + \frac{s}{k_{2}GBW}\right)} * \frac{s^{2} + k_{2}GBWs + k_{1}k_{2}(1 + k_{3})GBW^{2}}{s^{2} + k_{2}GBWs + k_{1}k_{2}GBW^{2}}$$
(5-15)

$$k_1 = \frac{\frac{g_{m3}}{C_2}}{GBW}, \quad k_2 = \frac{\frac{g_{m7}}{C_1}}{GBW} = \frac{\frac{g_{m8}}{C_x}}{GBW}, \quad k_3 = \frac{g_{m4}}{g_{m3}}, GBW = \frac{g_{m1}}{C_L} * \frac{k_3 + 1}{2}$$
 (5-16)

As can be seen from (5-15), there are four poles and two zeros in the system. Their locations are given by (5-17) to (5-22). Because the drain current of M7 is much smaller than that of M3, $k_1 > k_2$ and $\left(\frac{k_2}{2}\right)^2 < k_1k_2$, so the poles and zeros ($P_{nd2}$, $P_{nd3}$, $Z_{nd1}$ and $Z_{nd2}$) are complex. The distribution of all the poles and zeros of the system in the s-plane is shown in Figure 5.10. The complex poles have a lower natural frequency and a lower Q-factor than the complex zeros. However, the complex poles and zeros are close to each other, so the phase drop they cause is small in this design. This phase drop is calculated as (5-23).



Figure 5.10: Poles and zeros distribution of the proposed FCA

$$P_{d1} = -\frac{g_L}{C_L} \tag{5-17}$$

$$P_{nd1} = -\frac{g_{m8}}{C_x} = -k_2 * GBW$$
(5-18)

$$P_{nd2} = -GBW * \left(\frac{k_2}{2} - \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2}\right)$$
(5-19)

$$P_{nd3} = -GBW * \left(\frac{k_2}{2} + \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2}\right)$$
(5-20)

$$Z_{nd1} = -GBW * \left(\frac{k_2}{2} - \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2 (1+k_3)}\right)$$
(5-21)

$$Z_{nd2} = -GBW * \left(\frac{k_2}{2} + \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2 (1+k_3)}\right)$$
(5-22)
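As a quick numerical check of (5-19) to (5-22), the following sketch evaluates the complex pole pair and zero pair with frequencies normalized to GBW. It assumes the design values quoted elsewhere in this section ($k_1 \approx 3.5$, $k_2 \approx 2$, $k_3 = 8/7$); the function names are illustrative only.

```python
import cmath
import math

def quadratic_roots(k2, c):
    """Roots of s^2 + k2*s + c = 0, with s normalized to GBW."""
    disc = cmath.sqrt((k2 / 2) ** 2 - c)
    return (-k2 / 2 + disc, -k2 / 2 - disc)

def natural_freq_and_q(k2, c):
    """Natural frequency (in units of GBW) and Q-factor of s^2 + k2*s + c."""
    wn = math.sqrt(c)
    return wn, wn / k2

# Assumed design values taken from the text
k1, k2, k3 = 3.5, 2.0, 8 / 7

poles = quadratic_roots(k2, k1 * k2)             # (5-19), (5-20)
zeros = quadratic_roots(k2, k1 * k2 * (1 + k3))  # (5-21), (5-22)
wn_p, q_p = natural_freq_and_q(k2, k1 * k2)
wn_z, q_z = natural_freq_and_q(k2, k1 * k2 * (1 + k3))

print("poles / GBW:", poles)
print("zeros / GBW:", zeros)
print(f"poles: wn = {wn_p:.2f} GBW, Q = {q_p:.2f}")
print(f"zeros: wn = {wn_z:.2f} GBW, Q = {q_z:.2f}")
```

With these values both discriminants are negative, so both pairs are complex, and the pole pair has the lower natural frequency and the lower Q-factor, in line with the discussion above.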



$$\phi = -\tan^{-1}\left\{\frac{k_2}{k_1k_2 - 1}\right\} + \tan^{-1}\left\{\frac{k_2}{k_1k_2(1 + k_3) - 1}\right\}$$
(5-23)

$$PM = 90 - \tan^{-1}\left(\frac{1}{k_2}\right) - \tan^{-1}\left\{\frac{k_2}{k_1k_2 - 1}\right\} + \tan^{-1}\left\{\frac{k_2}{k_1k_2(1 + k_3) - 1}\right\}$$
(5-24)



Figure 5.11: Phase drop due to complex poles and zeros vs. k1 and k2



Figure 5.12: The proposed FCA's PM vs. k1 and k2

The dependency of this phase drop on $k_1$ and $k_2$, the pole frequencies normalized to the FCA's GBW, is shown in Figure 5.11. As can be seen, the phase drop is less than 9° even when $k_2=2$ and $k_1=2*k_2=4$. In this design, $k_1$ and $k_2$ are about 3.5 and 2. Therefore, the expected phase drop due to the complex poles and zeros is about 5°. The FCA's phase margin is calculated as (5-24), and its dependency on $k_1$ and $k_2$ is shown in Figure 5.12. As can be seen from Figure 5.12, the expected phase margin of the proposed FCA is about 70° at $k_1$=3.5 and $k_2$=2.
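The phase drop in (5-23) can be evaluated directly. The sketch below is a minimal illustration, not the script used to produce Figure 5.11. The value $k_3 = 8/7$ is an assumption borrowed from the noise analysis, since the $k_3$ used for the figure is not stated here, and `atan2` is used only to keep the angle well defined if $k_1k_2 - 1$ becomes negative.

```python
import math

def phase_drop_deg(k1, k2, k3):
    """Net phase (degrees) at s = j*GBW from the complex poles and zeros, per (5-23)."""
    pole_lag = math.atan2(k2, k1 * k2 - 1)
    zero_lead = math.atan2(k2, k1 * k2 * (1 + k3) - 1)
    return math.degrees(zero_lead - pole_lag)

k3 = 8 / 7  # assumed value, taken from the noise analysis
# The k2 = 2, k1 = 2*k2 = 4 case discussed in the text
print(f"{phase_drop_deg(4.0, 2.0, k3):.2f} degrees")
```

For $k_1 = 4$ and $k_2 = 2$ this evaluates to a net lag of just under 9°. Setting $k_3 = 0$ makes the zeros coincide with the poles, so the net phase is zero.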



#### 5.3.3.3 Noise Analysis

Figure 5.13: Noise model for the proposed FCA

As the proposed FCA's bias current for the cascode stage is smaller than that of the conventional fast FCA, it is of interest to analyze its impact on noise. The noise model of the proposed FCA is shown in Figure 5.13, where the noise contributed by the cascode transistors and by the transistors working in the cutoff region is neglected. The FCA's output current noise is derived as (5-25), where a transistor's voltage noise power is expressed as (5-26). The terms $\frac{8KT}{3g_{mi}}$ and $\frac{K_{f}}{W_{i}L_{i}C_{ox}f}$ in (5-26) respectively represent thermal and flicker noise. The transistors in current mirrors are typically sized to have the same length and current density. Consequently, their widths and transconductances scale linearly with their bias currents. Therefore, their voltage noise power is inversely proportional to their bias currents, whereas their current noise power is linearly proportional to their bias currents, as shown in (5-26) and (5-27). As a result, the noise relations in (5-28) can be established. After plugging (5-28) into (5-25), (5-25) simplifies to (5-29). Equation (5-29) is further simplified to (5-30) by neglecting $\frac{I_{n5}^2}{4\alpha}(k_3 - 1)^2$ because this term is much smaller than $I_{n5}^2(k_3^2 + 2)$. Therefore, the input referred voltage noise power of the FCA is derived as (5-31).

$$I_{\text{no,prop}}^{2} \approx \frac{g_{\text{m0}}^{2}e_{\text{n0}}^{2}}{4} \left(\frac{g_{\text{m4}}}{g_{\text{m3}}} - 1\right)^{2} + \left(g_{\text{m3}}^{2}e_{\text{n3}}^{2} + g_{\text{m2}}^{2}e_{\text{n2}}^{2} + g_{\text{m5}}^{2}e_{\text{n5}}^{2}\right) * \frac{g_{\text{m4}}^{2}}{g_{\text{m3}}^{2}} + \left(g_{\text{m4}}^{2}e_{\text{n4}}^{2} + g_{\text{m1}}^{2}e_{\text{n1}}^{2} + g_{\text{m6}}^{2}e_{\text{n6}}^{2} + g_{\text{m11}}^{2}e_{\text{n11}}^{2}\right)$$
(5-25)

$$\frac{e_{ni}^2}{\Delta f} = \frac{8KT}{3g_{mi}} + \frac{K_f}{W_i L_i C_{ox} f} \propto \frac{1}{I_{bias}}$$
(5-26)

$$I_{ni}^{2} = \frac{e_{ni}^{2}g_{mi}^{2}}{\Delta f} = \frac{g_{mi} * 8KT}{3} + \frac{g_{mi}^{2} * K_{f}}{W_{i}L_{i}C_{ox}f} \propto I_{bias}$$
(5-27)

$$I_{n5}^{2} = I_{n6}^{2} = I_{n11}^{2} = \alpha I_{n0}^{2}; \ I_{n1}^{2} = I_{n2}^{2}; \ I_{n3}^{2} = I_{n4}^{2}/k_{3}$$
(5-28)

$$I_{no,prop}^{2} \approx \frac{I_{n5}^{2}}{4\alpha} (k_{3} - 1)^{2} + (I_{n3}^{2} + I_{n1}^{2} + I_{n5}^{2}) * k_{3}^{2} + (k_{3}I_{n3}^{2} + I_{n1}^{2} + 2I_{n5}^{2})$$
(5-29)

$$I_{no,prop}^{2} \approx I_{n1}^{2} * (1 + k_{3}^{2}) + I_{n3}^{2}(k_{3} + k_{3}^{2}) + I_{n5}^{2}(2 + k_{3}^{2})$$
(5-30)

$$V_{ni}^{2} = \frac{I_{no,prop}^{2}}{[0.5 * g_{m1} * (k_{3} + 1)]^{2}} = \frac{I_{n1}^{2} * (1 + k_{3}^{2}) + I_{n3}^{2}(k_{3} + k_{3}^{2}) + I_{n5}^{2}(2 + k_{3}^{2})}{0.5 * g_{m1} * (k_{3} + 1) * GBW * C_{L}}$$
(5-31)

$$V_{no}^{2} = V_{ni}^{2} * \frac{\pi GBW}{2 * 2\pi} = \frac{[I_{n1}^{2} * (1 + k_{3}^{2}) + I_{n3}^{2}(k_{3} + k_{3}^{2}) + I_{n5}^{2}(2 + k_{3}^{2})]}{2g_{m1} * (k_{3} + 1) * C_{L}}$$
(5-32)

$$V_{\rm no,thermal}^2 = \frac{4KT}{3} \frac{\left[(1+k_3^2) + a(k_3+k_3^2) + b(2+k_3^2)\right]}{(k_3+1)C_{\rm L}} \approx \frac{2KT}{C_{\rm L}}$$
(5-33)



$$V_{\text{no,thermal,conv}}^{2} = \frac{\frac{4\text{KT}}{3} * [2 + 2g'_{\text{m3}}/g_{\text{m1}} + 2g'_{\text{m5}}/g_{\text{m1}}]}{2C_{\text{L}}} = \frac{4\text{KT}}{3C_{\text{L}}} * \left[1 + a * \frac{r + 0.5}{\alpha + 0.5} + b * \frac{r}{\alpha}\right] = \frac{3.2\text{KT}}{C_{\text{L}}}$$
(5-34)

When the proposed FCA is placed in a noninverting unity gain buffer configuration, its equivalent rectangular noise bandwidth is $\pi/2*GBW/(2\pi)=GBW/4$, where $GBW = 0.5g_{m1}(k_3 + 1)/C_L$. Therefore, the in-band output referred voltage noise power is calculated as (5-32). For a wideband FCA, the dominant noise source is thermal noise, and the in-band thermal noise is calculated as (5-33). This equation suggests that a, b and $k_3$ should be minimized to minimize the in-band thermal noise for a given load capacitor. In this design, $a=g_{m3}/g_{m1}=0.4$, $b=g_{m5}/g_{m1}=0.07$ and $k_3=g_{m4}/g_{m3}=8/7$. After plugging a, b and $k_3$ into (5-33), the proposed FCA's in-band thermal noise power is about $2KT/C_L$, which corresponds to 91µV rms at T=300K and C<sub>L</sub>=1pF.

Similarly, the thermal noise of the conventional FCA counterpart in Figure 5.5(b) is found as (5-34), where $g'_{m3}$ and $g'_{m5}$ are respectively the transconductances of transistors M3 and M5 in the conventional FCA counterpart. With a typical bias current of $r*I_{tail}=0.67*I_{tail}$ for the conventional FCA's cascode stage, it can be found that $\frac{g'_{m3}}{g_{m1}} = a * \frac{r+0.5}{\alpha+0.5}$ and $\frac{g'_{m5}}{g_{m1}} = b * \frac{r}{\alpha}$. As a result, the integrated thermal noise power of the conventional FCA is obtained as 3.2KT/C<sub>L</sub>, or 115µV rms at T=300K and C<sub>L</sub>=1pF, after plugging in a=0.4, b=0.07, $\alpha$=1/12, and r=0.67. Therefore, compared to the conventional FCA, the proposed FCA is expected to reduce the in-band noise voltage by 21% or 2.03dB.
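The coefficients in (5-33) and (5-34) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical and the Boltzmann constant is the only value not taken from the text. It evaluates the unrounded $KT/C_L$ multipliers and then converts the rounded multipliers quoted in the text (2 and 3.2) into rms voltages.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def coeff_prop(a, b, k3):
    """Multiplier of KT/C_L for the proposed FCA, from (5-33)."""
    bracket = (1 + k3**2) + a * (k3 + k3**2) + b * (2 + k3**2)
    return (4 / 3) * bracket / (k3 + 1)

def coeff_conv(a, b, r, alpha):
    """Multiplier of KT/C_L for the conventional FCA, from (5-34)."""
    return (4 / 3) * (1 + a * (r + 0.5) / (alpha + 0.5) + b * r / alpha)

def rms_noise(coeff, temp_k=300.0, c_load=1e-12):
    """RMS noise voltage (V) for a noise power of coeff*KT/C_L."""
    return math.sqrt(coeff * K_B * temp_k / c_load)

# Design values quoted in the text
a, b, k3, r, alpha = 0.4, 0.07, 8 / 7, 0.67, 1 / 12

print(f"prop. multiplier: {coeff_prop(a, b, k3):.2f}")
print(f"conv. multiplier: {coeff_conv(a, b, r, alpha):.2f}")
# Rounded multipliers used in the text
print(f"prop.: {rms_noise(2.0) * 1e6:.1f} uV, conv.: {rms_noise(3.2) * 1e6:.1f} uV")
```

The unrounded multipliers come out near 2.2 and 3.15, which the text rounds to 2 and 3.2. The rounded multipliers give roughly 91µV and 115µV at T=300K and 1pF.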



#### 5.3.3.4 Offset Voltage Analysis

The variances of transistor $M_i$'s threshold voltage and $\Delta\beta_i/\beta_i$ are expressed as (5-35), where $\beta_i = \mu C_{ox} W_i/L_i$. In addition, $A_{thi}^2$ and $A_{\beta i}^2$ are the mismatch coefficients, fixed parameters for a given process, of transistor $M_i$'s threshold voltage and feature sizes. The variance of transistor $M_i$'s drain current caused by random mismatch is shown in (5-36), where $I_{di}$ and $V_{odi}$ are respectively transistor $M_i$'s quiescent current and overdrive voltage. Under the fixed-current-density sizing strategy, (5-36) shows that the variance of $M_i$'s drain current is proportional to its bias current: the larger the bias current, the larger the drain current variation.

The input referred offset voltage of a FCA can be analyzed in a similar manner to how noise is analyzed in section 5.3.3.3. The proposed FCA's output current variation caused by the mismatches of the transistors (M1-M6) and M11 is derived as (5-37). Therefore, its input referred offset voltage,  $V_{os,prop}$ , is calculated as (5-38). In (5-38),  $c = I_{os3}^2/I_{os1}^2$  and  $d = I_{os5}^2/I_{os1}^2$ . Similarly, the input referred offset voltage for the conventional FCA,  $V_{os,conv}$ , in Figure 5.5(b) is calculated as (5-39), in which r=0.67 and  $\alpha$ =1/12. Compared to  $V_{os,conv}$ , it is clear that  $V_{os,prop}$  is smaller due to the reduced offset contribution from transistors M3 and M5. This is also confirmed by the Monte Carlo simulation results shown below.

$$\sigma_{\text{vthi}}^2 = \frac{A_{\text{thi}}^2}{W_i L_i} \quad , \quad \sigma^2(\frac{\Delta\beta_i}{\beta_i}) = \frac{A_{\beta i}^2}{W_i L_i}$$
(5-35)

$$I_{osi}^{2} = \sigma_{vthi}^{2} g_{mi}^{2} + \sigma^{2} \left(\frac{\Delta\beta_{i}}{\beta_{i}}\right) I_{di}^{2} = \frac{(A_{\beta i}^{2} V_{odi}^{2} + 4A_{thi}^{2}) I_{di}^{2}}{W_{i} L_{i} V_{odi}^{2}} \propto \frac{I_{di}^{2}}{W_{i}} \propto I_{di}$$
(5-36)

$$I_{os,out}^{2} = I_{os1}^{2} * (1 + k_{3}^{2}) + I_{os3}^{2} (k_{3} + k_{3}^{2}) + I_{os5}^{2} (2 + k_{3}^{2})$$
(5-37)

$$V_{os,prop}^{2} = \frac{I_{os1}^{2} * \left[ (1 + k_{3}^{2}) + c * (k_{3} + k_{3}^{2}) + d * (2 + k_{3}^{2}) \right]}{[0.5 * g_{m1} * (k_{3} + 1)]^{2}} \approx \frac{2I_{os1}^2}{g_{m1}^2} (1 + c + 1.5d); \quad c = \frac{I_{os3}^2}{I_{os1}^2}; d = \frac{I_{os5}^2}{I_{os1}^2}$$
(5-38)

$$V_{os,conv}^2 = \frac{2(I_{os1}^2 + I_{os3,conv}^2 + I_{os5,conv}^2)}{g_{m1}^2} = \frac{2I_{os1}^2}{g_{m1}^2} (1 + c * \frac{r + 0.5}{\alpha + 0.5} + d * \frac{r}{\alpha})$$
(5-39)
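To see how the approximate forms of (5-38) and (5-39) compare, the sketch below evaluates the ratio of the two offset variances. The values of c and d are not given in this section, so the pairs used here are placeholders chosen purely for illustration; only r=0.67 and $\alpha$=1/12 come from the text.

```python
import math

def vos_ratio_sq(c, d, r=0.67, alpha=1 / 12):
    """V_os,prop^2 / V_os,conv^2 from the approximate (5-38) and (5-39)."""
    prop = 1 + c + 1.5 * d
    conv = 1 + c * (r + 0.5) / (alpha + 0.5) + d * r / alpha
    return prop / conv

# Placeholder (c, d) pairs, NOT values from the design
for c, d in [(0.1, 0.1), (0.5, 0.5), (1.0, 1.0)]:
    ratio = math.sqrt(vos_ratio_sq(c, d))
    print(f"c={c}, d={d}: sigma(Vos,prop)/sigma(Vos,conv) = {ratio:.2f}")
```

Because $(r+0.5)/(\alpha+0.5) > 1$ and $r/\alpha > 1.5$ for the quoted r and $\alpha$, the ratio stays below one for any positive c and d, which is the sense in which $V_{os,prop}$ is smaller.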

### **5.4. Simulation Results for Proposed FCA vs. Conventional Fast FCA**

In order to confirm the effectiveness and robustness of the improved current utilization efficiency brought by the proposed FCA, two design examples are implemented in the 180nm CMOS process. The first design example is the conventional (conv.) fast FCA shown in Figure 5.5(b). The second design example is the proposed (prop.) FCA shown in Figure 5.8. Extensive simulations under various process corner variations, mismatch variations and process corner plus mismatch variations are conducted to compare the two design examples. The purposes of the simulations are twofold: a) to verify that the proposed FCA largely improves the FCA's current utilization efficiency (CUE); and b) to verify that noise, offset voltage, and gain are also improved as byproducts from improvement in the FCA's CUE.

All the simulation results below are collected with the design examples placed in a noninverting unity gain buffer configuration with a load capacitor of 1pF and a supply voltage of 1.8V. The nominal supply currents of the proposed and conventional op amps are respectively $1.88\mu$A and $3.5\mu$A, with the same tail current of $1.5\mu$A.

#### **5.4.1** Typical corner simulation results

#### 5.4.1.1 Frequency Response

The frequency responses of the proposed and conventional FCAs are shown in Figure 5.14. The DC gain of the proposed FCA, 89.7dB, is about 6dB higher than that of the conventional FCA, 83.5dB. The two FCAs have almost the same GBW of 2MHz. The phase margins of the conventional and proposed FCAs are respectively $74^{\circ}$ and $70^{\circ}$. The simulated PM of the proposed FCA agrees very well with the theoretical calculation in Section 5.3.3.2. The slight phase margin difference is caused by the much lower bias current in the proposed FCA's cascode stage. In the two design examples, the cascode stage's bias currents in the proposed and conventional FCAs are respectively 0.083 times and 0.67 times I<sub>tail</sub>.



Figure 5.14: Frequency responses of the proposed and conventional FCAs



#### 5.4.1.2 Noise Performance

The simulated noise performance of the two FCAs is shown in Figure 5.15. For example, the noise densities of the proposed and conventional FCAs at 100KHz are respectively 68nV/sqrt(Hz) and 88.4nV/sqrt(Hz). The noise reduction of the proposed FCA is a natural byproduct of the bias current reduction in the cascode stage. The total integrated noise from 0.01Hz to 2MHz (the FCA's GBW) for the proposed and conventional FCAs is respectively 93.2µV and 127.4µV. That is to say, compared with the conventional FCA, the proposed FCA reduces the integrated noise by 27%.



Figure 5.15: Noise performance of the proposed and conventional FCAs

#### 5.4.1.3 Transient Response

Figure 5.16 shows the step responses of the two FCAs with an input step voltage of 0.6V. As expected, the positive slew rate (SR+) of the proposed FCA is larger than that of the conventional FCA due to its inclusion of a turn-around stage. The positive and negative slew rates (SR+ and SR-) of the proposed FCA are $SR^{+}_{prop} = 5.84V/\mu s$ and $SR^{-}_{prop} = 1.49V/\mu s$, whereas those of the conventional FCA are $SR^{+}_{conv} = 1.1V/\mu s$ and $SR^{-}_{conv} = 1.34V/\mu s$. That is to say, the positive and negative SR improvements brought by the proposed FCA are 5.3 times and 1.1 times. The average SR of the proposed FCA is 3.67V/µs, about 3 times that of the conventional FCA. The simulated SR+ improvement is slightly higher than the calculated improvement factor of 4, due to the channel length modulation effect of the current mirror M14-M15. The simulated SR- improvement matches very well with the theoretical calculation.



Figure 5.16: Transient responses of the proposed and conventional FCAs

In addition, the settling times of the proposed and conventional FCAs are respectively 0.5µs and 0.75µs for an accuracy of 0.1% (Ts\_0.1%), and 0.72µs and 1.08µs for an accuracy of 0.01% (Ts\_0.01%). Therefore, the average Ts\_0.1% and Ts\_0.01% of the proposed FCA are both shorter than those of the conventional FCA by 34%. These simulation results match the theoretical settling times for 0.1% (7/GBW) and 0.01% (9/GBW) accuracy. This confirms that the proposed FCA does not need a long recovery time even though its cascode bias current is much smaller than its tail current.
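The 7/GBW and 9/GBW estimates can be reproduced with a single-pole settling model. The sketch below is a rough check under stated assumptions: it treats the unity gain buffer as a first-order system with time constant $1/(2\pi \cdot GBW)$, GBW in Hz, and ignores slewing and the non-dominant poles and zeros. Under that reading, the factors 7 and 9 are approximately $\ln(1000)$ and $\ln(10^4)$ time constants.

```python
import math

def settling_time(gbw_hz, accuracy):
    """Single-pole settling time (s) to a relative error `accuracy` (1e-3 for 0.1%)."""
    tau = 1.0 / (2 * math.pi * gbw_hz)  # closed-loop time constant in unity gain
    return math.log(1.0 / accuracy) * tau

gbw = 2.14e6  # Hz, simulated GBW of the proposed FCA
print(f"Ts_0.1%  ~ {settling_time(gbw, 1e-3) * 1e6:.2f} us")
print(f"Ts_0.01% ~ {settling_time(gbw, 1e-4) * 1e6:.2f} us")
```

This gives about 0.51µs and 0.69µs, close to the simulated 0.5µs and 0.72µs for the proposed FCA.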



#### 5.4.1.4 Performance Summary for Typical Corner Simulation

The performance of the two design examples is summarized in Table 5.3. The proposed and conventional FCAs have the same tail current of $1.5\mu$A, but the total bias currents of their cascode stages are respectively $0.38\mu$A and $2\mu$A. As a result, their supply currents are respectively $1.88\mu$A and $3.5\mu$A, and their current utilization efficiencies (CUE) are respectively 80% and 42%, where CUE is defined as the ratio of the tail current to the FCA's supply current. Therefore, compared with the conventional FCA, the proposed FCA increases the CUE by 2 times, enhances the average slew rate by 3 times, reduces Ts\_0.1% and Ts\_0.01% by 34%, and reduces the in-band noise by 27%.
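The current budget behind these numbers is simple arithmetic. The sketch below assumes that the wasted current is the supply current minus the tail current, and computes the normalized wasted current and the CUE for both FCAs; the helper name is illustrative.

```python
def current_budget(i_tail_ua, i_supply_ua):
    """Return (I_waste, I_waste/I_tail, CUE) for given tail and supply currents in uA."""
    i_waste = i_supply_ua - i_tail_ua
    return i_waste, i_waste / i_tail_ua, i_tail_ua / i_supply_ua

# Supply currents from the text; both FCAs use a 1.5 uA tail current
for name, i_supply in [("prop.", 1.88), ("conv.", 3.5)]:
    i_waste, waste_ratio, cue = current_budget(1.5, i_supply)
    print(f"{name}: Iwaste = {i_waste:.2f} uA, "
          f"Iwaste/Itail = {waste_ratio:.1%}, CUE = {cue:.1%}")
```

The proposed FCA comes out at roughly 25% normalized wasted current and 80% CUE, against roughly 133% and 43% for the conventional FCA.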

Compared with the conventional FCA, the proposed FCA improves the small signal figure of merit (FOM<sub>s</sub>) and the large signal figure of merit (FOM<sub>L</sub>) by 2 times and 5.5 times respectively. The FOM<sub>s</sub> and FOM<sub>L</sub> shown in (5-40) express an op amp's GBW and slew rate per unit supply current and are conventional measures for comparing op amp performance. The general idea is that an op amp with a larger FOM<sub>s</sub> and FOM<sub>L</sub> tends to work faster for a given supply current budget and a given load capacitor. However, because neither FOM<sub>s</sub> nor FOM<sub>L</sub> captures the transition between the fastest large signal slewing and the small signal settling, this general idea may not be valid in some cases. For example, some op amps with a slew rate enhancement (SRE) circuit have three slewing phases. In the first slewing phase, the SRE circuit is not activated. In the second slewing phase, the SRE circuit is turned on to enhance the slew rate. In the third phase, the SRE circuit is deactivated, followed by small signal settling. In the second slewing phase, some op amps may work in highly nonlinear regions, where the internal voltages and currents deviate far from their quiescent values. As a result, a long recovery time may be needed to return the internal voltages and currents to their quiescent values. However, this long recovery time cannot be captured by either FOM<sub>s</sub> or FOM<sub>L</sub>.

Therefore, we propose a new figure of merit to compare op amps' normalized settling time, because a normalized settling time is the ultimate speed requirement for a system. The proposed figure of merit is also able to capture slow settling behavior such as long recovery times. Equation (5-41) shows the expression of the settling time figure of merit (FOM<sub>Ts\_x%</sub>) with a settling accuracy of x%, where Ts\_x% is the op amp's settling time with x% settling accuracy in a noninverting buffer configuration. The value of x can be 1, 0.1, 0.01 or 0.001, depending on the settling accuracy required by the targeted application. The larger the FOM<sub>Ts\_x%</sub> is, the faster the op amp is. Compared with the conventional FCA, the proposed FCA improves both FOM<sub>Ts\_0.1%</sub> and FOM<sub>Ts\_0.01%</sub> by 2.8 times.

$$FOM_{s} = \frac{GBW * C_{L}}{I_{supply}}; FOM_{L} = \frac{SR * C_{L}}{I_{supply}}$$
(5-40)

$$FOM_{Ts_x\%} = \frac{C_L}{Ts_x\% * I_{supply}} , x = 1, 0.1, 0.01 ...$$
(5-41)

$$FOM_{noise} = \frac{V_{ni,total}^2}{V_{ni,input pair}^2}$$
(5-42)

In order to fairly compare the noise performance of op amps, we would also like to define a noise figure of merit, $FOM_{noise}$, whose expression is shown in (5-42). The purpose of $FOM_{noise}$ is to express the total integrated input referred noise relative to the integrated noise contributed by the input pair alone. If all the noise of an op amp comes from the input pair, then $FOM_{noise}=1$. A larger $FOM_{noise}$ means that more noise comes from devices other than the input pair, which signals poorer noise performance. In the two designed FCAs, the $FOM_{noise}$ of the proposed and conventional FCAs are respectively 2.6 and 5, meaning that the $FOM_{noise}$ of the proposed FCA is improved by 2.45 times compared with the conventional FCA. As discussed, this noise performance improvement is a natural byproduct of reducing the cascode stage's bias current.

| Output                                              | Unit        | Prop. | Conv. |
|-----------------------------------------------------|-------------|-------|-------|
| GBW                                                 | MHz         | 2.14  | 2.04  |
| PM                                                  | degree      | 70    | 74    |
| DC Gain                                             | dB          | 89.69 | 83.5  |
| Isupply                                             | μA          | 1.88  | 3.5   |
| Iwaste                                              | μA          | 0.38  | 2     |
| Itail                                               | μA          | 1.5   | 1.5   |
| Iwaste/Itail                                        | %           | 25.26 | 133.5 |
| Current utilization efficiency (Itail/Isupply)      | %           | 80    | 42    |
| SR_avg                                              | V/µs        | 3.67  | 1.2   |
| 0.1% Settling time @Vstep=0.6V                      | μs          | 0.5   | 0.75  |
| 0.01% Settling time @Vstep=0.6V                     | μs          | 0.72  | 1.08  |
| Vni @ 100KHz                                        | nV/sqrt(Hz) | 68.04 | 88.4  |
| Vni integrated to 2MHz                              | μV          | 93.12 | 127.4 |
| FOMs                                                | pF*MHz/µA   | 1.14  | 0.56  |
| FOML                                                | pF*V/µA-µs  | 1.95  | 0.35  |
| FOM <sub>Ts_0.1%</sub>                              | pF/µA-µs    | 1.07  | 0.38  |
| FOM <sub>Ts_0.01%</sub>                             | pF/μA-μs    | 0.74  | 0.26  |
| FOM <sub>noise</sub> (total noise/input pair noise) | $(V/V)^{2}$ | 2.6   | 5     |
| CL                                                  | pF          | 1     | 1     |
| Vsupply                                             | V           | 1.8   | 1.8   |

Table 5.3: Performance summary of the proposed and conventional FCAs
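The figure-of-merit rows of Table 5.3 follow from (5-40) and (5-41) and the tabulated GBW, average slew rate, settling time and supply current. The sketch below recomputes them; the function and dictionary keys are illustrative, and the inputs are the rounded table entries, so the results agree with the table only to within a few percent.

```python
def foms(gbw_mhz, sr_v_per_us, ts_us, i_supply_ua, c_load_pf=1.0):
    """FOM_s and FOM_L per (5-40), FOM_Ts per (5-41), in the units of Table 5.3."""
    return {
        "FOMs": gbw_mhz * c_load_pf / i_supply_ua,      # pF*MHz/uA
        "FOML": sr_v_per_us * c_load_pf / i_supply_ua,  # pF*V/uA-us
        "FOMTs": c_load_pf / (ts_us * i_supply_ua),     # pF/uA-us
    }

# Inputs from Table 5.3: GBW, SR_avg, Ts_0.1%, Isupply
prop = foms(2.14, 3.67, 0.5, 1.88)
conv = foms(2.04, 1.2, 0.75, 3.5)

for key in prop:
    print(f"{key}: prop. = {prop[key]:.2f}, conv. = {conv[key]:.2f}, "
          f"ratio = {prop[key] / conv[key]:.1f}")
```

Passing the 0.01% settling times (0.72µs and 1.08µs) instead of the 0.1% values gives the FOM<sub>Ts\_0.01%</sub> row in the same way.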

#### **5.4.2** Process corner and temperature variation simulation results

In this section, the two designed FCAs are simulated under process corner and temperature (P.T.) variations, with temperatures ranging from -40°C to 85°C. The purposes of the simulations are twofold: a) to verify the robustness of the proposed FCA under P.T. variations; and b) to confirm the advantages of the proposed FCA under P.T. variations. The simulations cover frequency response, transient response and noise performance, since these are commonly impacted by P.T. variations. The independent process corner and temperature variations are listed in Table 5.4. In total, there are 25 simulation setups: 1 typical corner and 24 combinations of P.T. variations.

|              | Typical | Corners              |
|--------------|---------|----------------------|
| Temperature  | 27°C    | -40°C, 27°C and 85°C |
| Low Vth MOS  | tntp    | snsp, snwp, wnsp, wnwp |
| High Vth MOS | tntp    | snsp, wnwp           |

Table 5.4: Simulation setup with process corner and temperature variation

#### 5.4.2.1 Frequency Response



Figure 5.17: Frequency responses of the two FCAs a) proposed b) conventional

Figure 5.17 shows the frequency responses of the proposed and conventional FCAs under P.T. variations. The (min, typ, max) of the proposed FCA's simulated DC gain, phase margin (PM) and GBW are respectively (83dB, 89.7dB, 89.7dB), (66.4°, 70°, 72.5°) and (1.76MHz, 2.14MHz, 2.4MHz). On the other hand, the (min, typ, max) of the conventional FCA's simulated DC gain, phase margin (PM) and GBW are respectively (80dB, 83.5dB, 83.5dB), $(73.7^{\circ}, 74^{\circ}, 74.5^{\circ})$ and (1.7MHz, 2.0MHz, 2.2MHz). The variations of DC gain, PM and GBW of both the conventional and proposed FCAs are small. The lowest DC gain of the proposed FCA occurs in the fast NMOS corner at T=85°C. In this corner, transistor M14 in Figure 5.8 is weakly on, which reduces the output impedance of the FCA. Nevertheless, the DC gain of the proposed FCA is always higher than that of the conventional FCA.

#### 5.4.2.2 Noise Performance

[Plot: voltage noise density (V/sqrt(Hz)) vs. frequency (Hz) under P.T. variation, with one panel each for the prop. FCA and the conv. FCA]

Figure 5.18: Noise performance of the prop. and conv. FCAs under P.T. variation

The simulated noise performance of the proposed and conventional FCAs under P.T. variations is shown in Figure 5.18. The input referred voltage noise densities of the proposed and conventional FCAs are respectively 55.7~87.5nV/sqrt(Hz) and 72.3~115nV/sqrt(Hz) at a frequency of 100KHz. The integrated noise from 0.01Hz to 2MHz of the proposed and conventional FCAs is respectively 76.7~118.9µV and 104.6~161.1µV. Therefore, both the minimum and maximum of the proposed FCA's integrated noise are about 26% lower than those of the conventional FCA.



#### 5.4.2.3 Transient Response

Figure 5.19: Transient responses of the prop. and conv. FCAs under P.T. variation

The transient responses of the two FCAs are simulated in the noninverting unity gain buffer configuration with an input step voltage of 0.6V under P.T. variations. The simulation results are shown in Figure 5.19. Both the positive and negative slew rates of the proposed and conventional FCAs show a very small spread under P.T. variations. This indicates the robustness of the proposed FCA in its positive slew rate enhancement. This robustness over P.T. variations is expected because the tail current in the positive slewing phase is amplified by a well-defined current gain, and the amplified current is then passed to the load capacitor by the turn-around stage. The mean SRs of the proposed and conventional FCAs range over $3.0 \sim 4.4 V/\mu s$ and $1.2 \sim 1.24 V/\mu s$ respectively. Also, the mean Ts\_0.1% of the proposed and conventional FCAs range over 0.43~0.63µs and 0.72~0.78µs respectively. This clearly shows the speed advantage of the proposed FCA over the conventional FCA. It also shows that the proposed FCA does not suffer from any long recovery time under P.T. variations.

#### 5.4.2.4 Performance Summary for P.T. Variation

The performance summary of the proposed and conventional FCAs under P.T. variations is shown in Table 5.5. Compared with the conventional FCA, the proposed FCA shows advantages in terms of GBW, DC gain, settling time and supply current under P.T. variations. This clearly demonstrates the advantages and robustness of the proposed FCA.

|                         |             | Proposed FCA |       |      | Conventional FCA |       |       |
|-------------------------|-------------|--------------|-------|------|------------------|-------|-------|
| Output                  | Unit        | Min          | Max   | Typ  | Min              | Max   | Typ   |
| GBW                     | MHz         | 1.76         | 2.4   | 2.14 | 1.7              | 2.2   | 2.04  |
| PM                      | degree      | 66.4         | 72.5  | 70   | 73.7             | 74.5  | 74    |
| DC Gain                 | dB          | 83           | 89.7  | 89.7 | 80               | 83.5  | 83.5  |
| SR_avg                  | V/µs        | 3.02         | 4.41  | 3.67 | 1.2              | 1.24  | 1.23  |
| Ts_0.1% @Vstep=0.6V     | μs          | 0.43         | 0.63  | 0.5  | 0.72             | 0.78  | 0.75  |
| Ts_0.01% @Vstep=0.6V    | μs          | 0.64         | 0.85  | 0.72 | 0.87             | 1.03  | 0.93  |
| Vni @ 100KHz            | nV/sqrt(Hz) | 55.7         | 87.5  | 68.0 | 72.3             | 113.8 | 88.4  |
| Vno integrated to 2MHz  | μV          | 76.7         | 118.9 | 93.1 | 104.6            | 161.1 | 127.4 |
| FOMs                    | pF*MHz/µA   | 0.94         | 1.28  | 1.14 | 0.48             | 0.63  | 0.56  |
| FOML                    | pF*V/µA-µs  | 1.6          | 2.34  | 1.95 | 0.345            | 0.354 | 0.352 |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs    | 0.85         | 1.23  | 1.07 | 0.37             | 0.4   | 0.38  |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs    | 0.62         | 0.84  | 0.74 | 0.25             | 0.28  | 0.27  |
| Isupply                 | μA          | 1.88         |       |      | 3.5              |       |       |
| Iwaste                  | μA          | 0.38         |       |      | 2.0              |       |       |
| Itail                   | μA          | 1.5          |       |      | 1.5              |       |       |
| CUE (Itail/Isupply)     | %           | 80           |       |      | 43               |       |       |
| CL                      | pF          | 1.0          |       |      |                  |       |       |
| Vsupply                 | V           | 1.8          |       |      |                  |       |       |
| Process                 |             | 180nm CMOS   |       |      |                  |       |       |

Table 5.5: Performance summary of the prop. and conv. FCAs under P.T. variation



#### **5.4.3** Mismatch variation simulation results

This section details the results of the two designed FCAs simulated under mismatch variations via the 500-run Monte Carlo simulation. The purposes of the simulations are twofold: a) to verify the robustness of the proposed FCA under mismatch variations; and b) to confirm the advantages of the proposed FCA under mismatch variations. The simulated performance includes transient response, offset voltage, DC gain and supply current.

The simulated transient responses of the proposed and conventional FCAs under mismatch variations are shown in Figure 5.20. The slew rates of both the proposed and conventional FCAs show very small variations under device mismatch. The (mean, sigma) of the slew rate of the proposed and conventional FCAs are respectively $(3.66V/\mu s, 0.433V/\mu s)$ and $(1.23V/\mu s, 0.024V/\mu s)$. The (mean, sigma) of Ts\_0.01% of the proposed and conventional FCAs are respectively $(0.72\mu s, 0.02\mu s)$ and $(1.08\mu s, 0.02\mu s)$. In addition to a faster speed, the proposed FCA also has a smaller random offset voltage. The (mean, sigma) of the offset voltages of the proposed and conventional FCAs are respectively (-0.14mV, 2.09mV) and (-0.14mV, 2.95mV). Therefore, the sigma of the proposed FCA's offset voltage is lower by about 29%. The random mismatch has a negligible impact on the DC gain and supply current of the two FCAs.

Both FCAs have a tail current of $1.5\mu$A. As discussed before, I<sub>tail</sub> is determined based on GBW and noise specifications. Any extra current other than $I_{tail}$ is considered as the FCA's wasted current, I<sub>waste</sub>. The normalized wasted currents (I<sub>waste</sub>/I<sub>tail</sub>) of the proposed and conventional FCAs are respectively 25% and 133%. The CUEs of the proposed and conventional FCAs are respectively 80% and 43%.





Figure 5.20: Transient responses of the prop. and conv. FCAs under mismatch variation

#### 5.4.3.1 Performance Summary for Mismatch Variation Simulation

The performance summary of the proposed and conventional FCAs under mismatch variations is shown in Table 5.6. Compared with the conventional FCA, the proposed FCA's CUE is improved from 43% to 80% by significantly reducing the bias current in its cascode stage. Due to this reduced bias current, the integrated noise, offset and gain performance of the proposed FCA are also improved by 27%, 29% and 6dB respectively. More importantly, the average Ts\_0.1% and Ts\_0.01% are both improved by 34% in the proposed FCA. The significant supply current reduction and the moderate improvements in settling time, noise, offset voltage and DC gain under mismatch variations clearly demonstrate the advantages and robustness of the proposed FCA.



|                         |             | Proposed   |       | Conventional |       |
|-------------------------|-------------|------------|-------|--------------|-------|
| Output                  | Unit        | Mean       | Stdev | Mean         | Stdev |
| Vos                     | mV          | -0.14      | 2.092 | -0.14        | 2.94  |
| GBW                     | MHz         | 2.14       | 0.044 | 2.04         | 0.033 |
| PM                      | degree      | 69.78      | 0.606 | 74           | 0.6   |
| DC Gain                 | dB          | 89.68      | 0.069 | 83.5         | 0.9   |
| SR_avg                  | V/µs        | 3.66       | 0.434 | 1.23         | 0.024 |
| Ts_0.1%                 | μs          | 0.50       | 0.013 | 0.75         | 0.02  |
| Ts_0.01%                | μs          | 0.72       | 0.021 | 1.08         | 0.02  |
| Vni @ 100KHz            | nV/sqrt(Hz) | 68.1       | 0.535 | 88.5         | 1.85  |
| Vni integrated to 2MHz  | μV          | 93.22      | 0.633 | 127.6        | 1.88  |
| FOMs                    | pF*MHz/µA   | 1.14       | 0.015 | 0.57         | 0.026 |
| FOML                    | pF*V/µA-µs  | 1.95       | 0.233 | 0.35         | 0.015 |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs    | 1.08       | 0.025 | 0.38         | 0.013 |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs    | 0.74       | 0.019 | 0.27         | 0.011 |
| Isupply                 | μA          | 1.88       | 0.032 | 3.5          | 0.19  |
| Iwaste                  | μA          | 0.38       | 0.005 | 2.0          | 0.177 |
| Itail                   | μA          | 1.50       | 0.032 | 1.50         | 0.032 |
| Iwaste/Itail            | %           | 25.26      | 0.37  | 134          | 11.8  |
| CUE (Itail/Isupply)     | %           | 80         | 1.7   | 42.8         | 0.16  |
| CL                      | pF          | 1.00       | NA    | 1.00         | NA    |
| Vsupply                 | V           | 1.8        | NA    | 1.8          | NA    |
| Process                 |             | 180nm CMOS |       |              |       |

Table 5.6: Performance summary of the prop. and conv. FCA under mismatch variation
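The figures of merit in Table 5.6 can be roughly cross-checked from the mean values of the other rows. The sketch below assumes definitions inferred only from the units listed in the table (FOM<sub>S</sub> = C<sub>L</sub>·GBW/I<sub>supply</sub>, FOM<sub>L</sub> = C<sub>L</sub>·SR/I<sub>supply</sub>, FOM<sub>Ts</sub> = C<sub>L</sub>/(I<sub>supply</sub>·Ts)); these are assumptions, not definitions quoted from this section, and the function and key names are hypothetical. Because the table reports the mean of each per-run FOM rather than the FOM of the mean values, small differences from the tabulated entries are expected.

```python
# Mean values copied from Table 5.6; C_L is 1 pF for both amplifiers.
C_L_PF = 1.0
rows = {
    "proposed": dict(gbw_mhz=2.14, sr_v_per_us=3.66,
                     ts01_us=0.50, ts001_us=0.72, isupply_ua=1.88),
    "conventional": dict(gbw_mhz=2.04, sr_v_per_us=1.23,
                         ts01_us=0.75, ts001_us=1.08, isupply_ua=3.5),
}


def figures_of_merit(row, c_l_pf=C_L_PF):
    """Compute FOMs from mean values (definitions assumed from the units)."""
    i_supply = row["isupply_ua"]
    return {
        "FOM_S": c_l_pf * row["gbw_mhz"] / i_supply,        # pF*MHz/uA
        "FOM_L": c_l_pf * row["sr_v_per_us"] / i_supply,    # pF*V/(uA*us)
        "FOM_Ts_0.1%": c_l_pf / (i_supply * row["ts01_us"]),    # pF/(uA*us)
        "FOM_Ts_0.01%": c_l_pf / (i_supply * row["ts001_us"]),  # pF/(uA*us)
    }


foms = {name: figures_of_merit(row) for name, row in rows.items()}
for name, values in foms.items():
    print(name, {k: round(v, 3) for k, v in values.items()})
```

The recomputed values land within a few percent of the tabulated means, which suggests the assumed definitions are at least consistent with the table.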

# 5.4.4 Process corner plus mismatch variation simulation results

In this section, the two designed FCAs are simulated under both process corner and mismatch (P.Mis) variations via the 500-run Monte Carlo simulation. The purposes of the simulations are twofold: a) to verify the robustness of the proposed FCA under P.Mis variations; and b) to confirm the advantages of the proposed FCA under P.Mis variations. The simulated performance discussed in this section is the transient response. The FOM<sub>s</sub>, FOM<sub>L</sub>, FOM<sub>Ts\_0.1%</sub>, and FOM<sub>Ts\_0.01%</sub> of the FCAs are also reported.





Figure 5.21: Transient responses of the prop. and conv. FCAs under P.Mis. variation

Figure 5.21 shows the simulated transient responses of the proposed and conventional FCAs under P.Mis variations. Figure 5.22 and Figure 5.23 respectively show the histograms of the average Ts\_0.01% of the proposed and conventional FCAs under P.Mis variations.

The average SRs and settling times of both FCAs show normal distributions. The (mean, sigma) of the proposed and conventional FCAs' SRs are respectively (3.61 V/µs, 0.49 V/µs) and (1.23 V/µs, 0.024 V/µs). The (mean, sigma) of the proposed and conventional FCAs' Ts\_0.1% are respectively (0.5 µs, 0.014 µs) and (0.75 µs, 0.02 µs). In addition, the (mean, sigma) of the proposed and conventional FCAs' Ts\_0.01% are respectively (0.72 µs, 0.027 µs) and (1.08 µs, 0.021 µs). The (mean, sigma) of the proposed and conventional FCAs' offset voltages are respectively (-0.12 mV, 2.2 mV) and (0.11 mV, 3.1 mV). Most importantly, all the performance improvements brought by the proposed FCA are achieved with a much smaller power consumption, which is about 53.7% of that of the conventional FCA. Therefore, the (mean, sigma) of the proposed and conventional FCAs' FOM<sub>s</sub> are (1.14 pF\*MHz/µA, 0.02 pF\*MHz/µA) and (0.56 pF\*MHz/µA, 0.26 pF\*MHz/µA). The (mean, sigma) of the two FCAs' FOM<sub>L</sub> are (1.92 pF\*V/µA-µs, 0.26 pF\*V/µA-µs) and (0.35 pF\*V/µA-µs, 0.025 pF\*V/µA-µs). The (mean, sigma) of the two FCAs' FOM<sub>Ts\_0.01%</sub> are (0.74 pF/µA-µs, 0.026 pF/µA-µs) and (0.266 pF/µA-µs, 0.012 pF/µA-µs). Therefore, compared with the conventional FCA, the average improvements of FOM<sub>s</sub>, FOM<sub>L</sub> and FOM<sub>Ts\_0.01%</sub> are respectively 2 times, 5.5 times and 2.8 times.



Figure 5.22: Average Ts\_0.01% of the proposed FCA under P.Mis. variation



Figure 5.23: Average Ts\_0.01% of the conventional FCA under P.Mis. variation

# 5.4.4.1 Performance Summary for P.Mis Variation

The performance summary of the proposed and conventional FCAs is shown in Table 5.7. Compared with the conventional FCA, the proposed FCA not only reduces its power consumption but also improves its settling, noise, offset and DC gain performance under P.Mis variations. In addition, the advantages of the proposed FCA remain robust under process corner and random device mismatch variations.

|                         |             | Proposed   |       | Conventional |       |
|-------------------------|-------------|------------|-------|--------------|-------|
| Output                  | Unit        | Mean       | Stdev | Mean         | Stdev |
| Vos                     | mV          | -0.12      | 2.175 | 0.11         | 3.11  |
| GBW                     | MHz         | 2.14       | 0.044 | 1.98         | 0.034 |
| PM                      | degree      | 69.77      | 0.639 | 73.8         | 0.64  |
| DC Gain                 | dB          | 89.44      | 0.510 | 82           | 3.4   |
| SR_avg                  | V/µs        | 3.61       | 0.494 | 1.23         | 0.024 |
| Ts_0.1%                 | μs          | 0.50       | 0.014 | 0.75         | 0.02  |
| Ts_0.01%                | μs          | 0.72       | 0.027 | 1.08         | 0.02  |
| Vni @ 100KHz            | nV/sqrt(Hz) | 68.07      | 0.777 | 88.6         | 1.9   |
| Vno integrated to 2MHz  | μV          | 93.20      | 0.947 | 128          | 2.0   |
| FOMs                    | pF*MHz/µA   | 1.14       | 0.019 | 0.56         | 0.006 |
| FOML                    | pF*V/µA-µs  | 1.92       | 0.262 | 0.35         | 0.003 |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs    | 1.08       | 0.028 | 0.38         | 0.003 |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs    | 0.73       | 0.022 | 0.265        | 0.004 |
| Isupply                 | μA          | 1.88       | 0.028 | 3.5          | 0.19  |
| Iwaste                  | μA          | 0.38       | 0.005 | 2.0          | 0.177 |
| Itail                   | μA          | 1.50       | 0.028 | 1.50         | 0.028 |
| Iwaste/Itail            | %           | 25.25      | 0.362 | 134.9        | 11.7  |
| CUE (Itail/Isupply)     | %           | 80         | 1.7   | 42.8         | 0.16  |
| CL                      | pF          | 1.00       | NA    | 1.00         | NA    |
| Vsupply                 | V           | 1.8        | NA    | 1.8          | NA    |
| Process                 |             | 180nm CMOS |       |              |       |

Table 5.7: Performance summary of the prop. and conv. FCA under P.Mis variation

# **5.5.Performance Comparison of This Work with the literature**

Table 5.8 summarizes the performance of the proposed FCA compared with the conventional FCA and [6] in a typical corner at room temperature. Compared with [6] and the conventional FCA, the proposed FCA reduces its I<sub>waste</sub> by 2.89 times and 5.3 times, which consequently increases its CUE by 1.33 times and 1.9 times respectively. In addition, as byproducts of minimizing the bias current of the proposed FCA's cascode stage, its random offset voltage is reduced to 82.5% of [6] and 73.8% of the conventional FCA. Similarly, the proposed FCA's integrated noise from 0.01Hz to 2MHz is reduced to 82.5% of [6] and 72.9% of the conventional FCA.

| Output                  | Unit        | This work  | [6]   | Conv. FCA |
|-------------------------|-------------|------------|-------|-----------|
| Vos                     | mV          | 2.18       | 2.64  | 2.95      |
| GBW                     | MHz         | 2.14       | 2.2   | 2.0       |
| PM                      | degree      | 70         | 70    | 74        |
| DC Gain                 | dB          | 89.7       | 92.75 | 83.5      |
| Isupply                 | μA          | 1.88       | 2.6   | 3.5       |
| Iwaste                  | μA          | 0.38       | 1.1   | 2.0       |
| Itail                   | μA          | 1.50       | 1.50  | 1.50      |
| Iwaste/Itail            | %           | 25.25      | 73.3  | 133.3     |
| CUE (Itail/Isupply)     | %           | 80         | 60    | 43        |
| SR_avg                  | V/µs        | 3.66       | 1.355 | 1.23      |
| Ts_0.1%                 | μs          | 0.50       | 0.855 | 0.75      |
| Ts_0.01%                | μs          | 0.72       | 1.6   | 1.08      |
| Vni @ 100KHz            | nV/sqrt(Hz) | 68.07      | 82.2  | 88.5      |
| Vno integrated to 2MHz  | μV          | 93.20      | 113.0 | 127.5     |
| FOMs                    | pF*MHz/µA   | 1.14       | 0.87  | 0.565     |
| FOML                    | pF*V/µA-µs  | 1.92       | 0.52  | 0.352     |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs    | 1.08       | 0.449 | 0.38      |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs    | 0.73       | 0.24  | 0.268     |
| FOM <sub>noise</sub>    | $(V/V)^2$   | 2.6        | 3.6   | 5.0       |
| CL                      | pF          | 1.00       | 1.00  | 1.00      |
| Process                 |             | 180nm CMOS |       |           |

Table 5.8: Performance comparison of the proposed FCA to the state-of-the-art method and the conventional FCA

Moreover, Ts\_0.1% of the proposed FCA is reduced to 58% of [6] and 66.7% of the conventional FCA, whereas Ts\_0.01% of the proposed FCA is reduced to 45.0% of [6] and 67.3% of the conventional FCA. The proposed FCA's settling time is much shorter than that of [6] because the proposed FCA completely eliminates the long recovery time after the slewing phase completes, while [6] does not when  $I_b < I_{tail}/4$  in Figure 5.2. The complex frequency compensation of [6] also degrades its settling time performance. In contrast to [6], the proposed FCA does not need any frequency compensation. This not only makes the FCA design much simpler but also saves a considerable amount of area that would otherwise be consumed by compensation capacitors and resistors. In terms of figures of merit, in comparison with [6], the proposed FCA increases FOM<sub>s</sub>, FOM<sub>L</sub>, FOM<sub>Ts\_0.1%</sub> and FOM<sub>Ts\_0.01%</sub> by 1.31, 3.69, 2.4 and 3.04 times. Compared with the conventional FCA, the proposed FCA improves FOM<sub>s</sub>, FOM<sub>L</sub>, FOM<sub>Ts\_0.1%</sub> and FOM<sub>Ts\_0.01%</sub> by 2.0, 5.5, 2.86 and 2.7 times. In addition, the FOM<sub>noise</sub> of the proposed FCA is also reduced to 71% of [6] and 52% of the conventional FCA.

The simultaneous performance improvement on CUE and settling time by the proposed FCA demonstrates its clear advantages over [6] and the conventional FCA.
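The ratios quoted in this comparison follow directly from the entries of Table 5.8. As a minimal sketch (the dictionary layout and the `ratios` helper are hypothetical names introduced only for illustration), they can be recomputed as follows:

```python
# Entries copied from Table 5.8 as (this work, [6], conventional FCA).
table_5_8 = {
    "Iwaste_uA": (0.38, 1.1, 2.0),
    "CUE_pct": (80, 60, 43),
    "Ts_0.1%_us": (0.50, 0.855, 0.75),
    "Ts_0.01%_us": (0.72, 1.6, 1.08),
    "FOM_S": (1.14, 0.87, 0.565),
    "FOM_L": (1.92, 0.52, 0.352),
    "FOM_Ts_0.1%": (1.08, 0.449, 0.38),
    "FOM_Ts_0.01%": (0.73, 0.24, 0.268),
}


def ratios(metric):
    """Return (this work / [6], this work / conventional) for one table row."""
    this_work, ref6, conventional = table_5_8[metric]
    return this_work / ref6, this_work / conventional


for metric in table_5_8:
    vs_ref6, vs_conv = ratios(metric)
    print(f"{metric:>13}: {vs_ref6:.2f} x [6], {vs_conv:.2f} x conventional")
```

A ratio below 1 means the quantity was reduced (waste current, settling time), and a ratio above 1 means it was increased (CUE, figures of merit).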

## **5.6.Discussion**

In summary, compared to [6], the proposed FCA design has the following benefits.

- 1) No long recovery time is needed even when the cascode stage's bias current is only 1/12 of  $I_{tail}$. The method in [6] starts to suffer from a long recovery time when its cascode stage's bias current becomes less than  $0.5*I_{tail}$.
- 2) The design involves much less complexity given that the complex frequency compensation in [6] is omitted.
- 3) Area consumption decreases significantly given that no large compensation capacitors are used.
- 4) Power consumption for the cascode stage is lowered because the nondominant poles associated with the differential-to-single-ended conversion circuit in the proposed FCA are at higher frequencies.
- 5) The proposed design has good compatibility with potentially additional gain enhancement circuits mentioned in Chapter 2.





Figure 5.24: A circuit to reduce leakage current of M14 in the turn-around stage

There is also a potential limitation to the proposed FCA design when a very high DC gain is needed. As mentioned in Section 5.4.2, transistor M14 in Figure 5.8 could be weakly on at the fast NMOS corner when T=85°C. When M14 is on in the quiescent operation, the output impedance of the FCA is reduced, which ultimately limits the largest achievable DC gain. This can be solved by replacing M14 with a very high threshold voltage device if one is available in the process. Otherwise, the leakage issue can be solved by adding transistors M12 and M24 to the circuit as shown in Figure 5.24. In the quiescent operation, M13 works in the triode region and M12 works in the cutoff region, so the gate voltage of M14 is Vss. This minimizes the leakage current from M14 and thereby increases the largest achievable DC gain. In the large-signal operation, the circuit in Figure 5.24 functions similarly to the turn-around stage in Figure 5.8. This circuit is discussed at length in Chapter 6.



#### 5.7.Summary

A new and simple turn-around stage that effectively improves an FCA's current utilization efficiency (CUE) has been introduced. The proposed FCA does not suffer from a long recovery time even though its cascode bias current is only 8.3% of its tail current. In addition, the settling performance of the proposed FCA is improved by the larger average slew rate (SR) brought about by the new turn-around stage. Furthermore, as byproducts of the reduced bias current in the cascode stage, the noise and offset of the proposed FCA are also improved. Compared to [6], the proposed FCA increases CUE and SR by 1.33 and 2.7 times. The proposed FCA's settling times to 0.1% and 0.01% accuracy are decreased to 58% and 45% of those of [6]. The theoretical calculations for the proposed FCA agree closely with its simulation results.

Due to its design simplicity, high CUE, low noise, and low offset voltage, the proposed FCA is well suited for applications and systems where FCAs are used as single-stage amplifiers or as the first stage in multi-stage amplifiers. The applications include but are not limited to switched-capacitor circuits, battery monitoring circuits, load current sensing circuits, LDO error amplifiers, and sigma-delta ADCs.

#### **5.8.References**

- [1]. R. S. Assaad and J. Silva-Martinez, "The Recycling Folded Cascode: A General Enhancement of the Folded Cascode Amplifier," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 9, pp. 2535-2542, Sept. 2009.
- [2]. P. Y. Wu, V. S.-L. Cheung, and H. C. Luong, "A 1-V 100-MS/s 8-bit CMOS switched-opamp pipelined ADC using loading-free architecture," *IEEE J. Solid-State Circuits*, vol. 42, no. 4, pp. 730–738, Apr. 2007.
- [3]. P. E. Allen and D. R. Holberg, *CMOS analog circuit design*, Second Edition, pp. 307, Oxford Univ. Press, 2002.
- [4]. B. Razavi, *Design of analog CMOS integrated circuits*, International Edition, pp. 458, 2001.
- [5]. W. Sansen, *Analog design essentials*, Vol. 859, pp. 141, Springer Science & Business Media, 2007.
- [6]. R. Eschauzier and N. V. Rijn, "Apparatus and method for a compact class AB turn-around stage with low noise, low offset, and low power consumption," U.S. Patent No. 6,624,696, 23 Sep. 2003.
- [7]. C. C. Enz and G. C. Temes, "Circuit techniques for reducing the effects of op-amp imperfections: autozeroing, correlated double sampling, and chopper stabilization," *Proceedings of the IEEE*, vol. 84, no. 11, pp. 1584-1614, Nov. 1996.



# CHAPTER 6. COMBINED PERFORMANCE ENHANCEMENT TECHNIQUES FOR FOLDED CASCODE AMPLIFIERS

Many applications such as continuous-time sigma delta ADCs require an op amp with high gain, high slew rate, low noise, low offset, low power and large input common mode range (ICMR). In this chapter, a single-stage folded cascode amplifier (FCA) is designed with these three techniques combined: the proposed gain enhancement (GE) technique in Chapter 2, the slew rate enhancement (SRE) technique in Chapter 3 and the current utilization efficiency (CUE) enhancement technique in Chapter 5. The purposes of combining the techniques are twofold: a) confirm that these three proposed techniques are compatible; and b) confirm that a FCA with the combined techniques can simultaneously have high DC gain, high slew rate, low noise, low offset and low power.

#### **6.1.Schematic Design**

Figure 6.1 shows the schematic of the proposed FCA combining techniques of GE, SRE and CUE enhancement. The proposed FCA consists of a FCA core formed by transistors M0-M10, a GE circuit formed by transistors M21-M23, an additional turn-around stage formed by transistors M12-M14, and a negative SRE circuit. The negative SRE circuit is shown in Figure 6.2.

The additional turn-around stage is normally off and is only activated during the FCA's positive slewing phase. This design greatly enhances the FCA's current conveyance capability during the positive slewing phase while keeping the bias current consumption of the turn-around stage to a minimum and generating very low noise and offset voltage. As a result, the bias current of the FCA's cascode stage can be reduced to a current much smaller than I<sub>tail</sub>. The cascode stage's bias current is annotated as  $2\alpha*I_{tail}$, where I<sub>tail</sub> is the drain current of transistor M0. The smaller  $\alpha$  is, the lower the noise, offset voltage and power consumption of the FCA are. However,  $\alpha$  cannot be made arbitrarily small because it affects the frequency of the nondominant pole associated with node V<sub>x</sub>, as discussed in Chapter 5. Therefore, a proper value of  $\alpha$  must be selected. In this design,  $\alpha = 1/12$.



Figure 6.1: Schematic of the proposed FCA with gain, slew rate and CUE enhancement



Figure 6.2: Schematic of the negative SRE circuit for the proposed FCA



In the FCA, there are three signal paths from the FCA's inputs to its output. The first signal path, shown by the blue arrow line, conducts signal to the output whenever a differential input voltage exists. The second signal path, marked by the green arrow line, is activated only when  $V_{id}>V_{on\_pos}$  or  $\Delta V_x > \Delta V_{x,on\_pos}$. Here  $V_{id}$  is the differential input voltage, and  $\Delta V_x$  is the positive voltage change at the  $V_x$  node upon application of a positive  $V_{id}$  at the input pair.  $V_{on\_pos}$  and  $\Delta V_{x,on\_pos}$  are respectively the positive threshold voltages of  $V_{id}$  and  $\Delta V_x$  required to activate the turn-around stage. The third signal path, marked by the red line, is the negative SRE path. This path is activated only when  $V_{id} < V_{on\_neg}$. Similar to the definition of  $V_{on\_pos}$,  $V_{on\_neg}$  is the negative threshold voltage of  $V_{id}$  required to activate the negative SRE circuit. The workings of the signal paths in the quiescent, small-signal and large-signal operations are discussed below.

Transistor M13 is designed to carry half of the bias current of transistor M8 while having the same size. As a result, M13 works in the triode region in the quiescent operation, which leads to a low drain-source voltage for M13, i.e.  $V_y$  approximates  $V_x$. When the DC bias voltage of  $V_x$  is kept below transistor M21's threshold voltage, transistor M21 works in the cutoff region. As a result, transistor M22 works in the triode region and transistor M14 works in the cutoff region. Therefore,  $V_z$  approximates  $V_{ss}$  and the turn-around stage is off in the quiescent operation.  $V_z$  is so close to  $V_{ss}$  that transistor M14's leakage current is minimized, which consequently improves the maximum achievable DC gain of the proposed FCA.

Upon application of a positive differential input signal,  $V_{id}$, the source voltage of transistor M13 increases by  $\Delta V_x$. Transistor M13 stays in the triode region, transistor M21 stays in the cutoff region, and the turn-around stage remains off until  $V_{id}$  and  $\Delta V_x$  reach  $V_{on\_pos}$  and  $\Delta V_{x,on\_pos}$  respectively. When  $V_{id}=V_{on\_pos}$  and  $\Delta V_x=\Delta V_{x,on\_pos}$, transistors M13, M21 and M22 move from their quiescent operation regions (triode, cutoff, triode) to the saturation region. As a result, any  $V_{id}>V_{on\_pos}$  quickly increases the drain voltage of M13 and the drain current of M21, which turns on the turn-around stage. Therefore, the boundary between disabling and enabling the turn-around stage can be approximately marked by the transition of M13, M21 and M22 from their quiescent operation regions to the saturation region. At this transition point, the drain currents of M8 and M13 are respectively expressed as (6-1) and (6-2), where  $\beta_8 = \mu_n C_{ox} W_8 / L_8$  and  $\beta_{13} = \mu_n C_{ox} W_{13}/L_{13}$. Also,  $V_{od8}$  and  $\Delta I_{d8}$  are respectively transistor M8's overdrive voltage and drain current change. The value of  $\lambda$  is  $0.4*I_{bias}/(\alpha*I_{tail}) = 0.2$. By dividing (6-1) by (6-2) and substituting  $\beta_{13}=\beta_8$, it is found that  $\Delta I_{d8}=-(1+\lambda)*\alpha I_{tail}\approx -0.1*I_{tail}$. Since M8 carries  $2\alpha I_{tail}$  with an overdrive voltage of  $V_{od8}$  in the quiescent operation, (6-2) also gives  $\Delta V_{x,on\_pos}=\left[1-\sqrt{0.5(1-\lambda)}\right]*V_{od8}\approx 0.37*V_{od8}\approx 26$mV. At the transition point, M14 is still off and the drain current changes of M8 and M13 come from the input differential pair. Therefore, the input-referred turn-on voltage of the turn-around stage,  $V_{on\_pos}$, is calculated as (6-3) by solving the KCL equation at M13's source node. In (6-3),  $g_{m1}$  and  $V_{od1}$  are respectively the transconductance and overdrive voltage of transistor M1. In addition,  $A_4$  and  $A_3$  are respectively the aspect ratios of transistors M4 and M3. In this design,  $A_4/A_3=5/4$. As a result,  $V_{on\_pos}$  is found to be about 7.8mV, assuming that  $V_{od1}$  is in the neighborhood of 70~80mV.

$$\left(V_{\text{od8}} - \Delta V_{\text{x,on\_pos}}\right)^2 * 0.5\beta_8 = 2\alpha * I_{\text{tail}} + \Delta I_{\text{d8}}$$
(6-1)

$$(V_{od8} - \Delta V_{x,on\_pos})^2 * 0.5\beta_{13} = (V_{od8} - \Delta V_{x,on\_pos})^2 * 0.5\beta_8 = \alpha I_{tail}(1 - \lambda)$$
 (6-2)

$$V_{\text{on}\_\text{pos}} = -\frac{\Delta I_{\text{d8}} + \Delta I_{\text{d13}}}{\frac{g_{\text{m1}}}{2} \left(1 + \frac{A_4}{A_3}\right)} = \frac{\alpha(1 + 2\lambda)I_{\text{tail}}}{\frac{I_{\text{tail}}}{2V_{\text{od1}}} \left(\frac{A_4}{A_3} + 1\right)} = \frac{0.23 * V_{\text{od1}}}{\left(\frac{A_4}{A_3} + 1\right)} \approx 7.8 \text{mV}$$
(6-3)
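The threshold values derived from (6-1) to (6-3) can be checked numerically. The sketch below uses the design values given in the text (α = 1/12, λ = 0.2, A<sub>4</sub>/A<sub>3</sub> = 5/4); the overdrive voltages are assumptions chosen for illustration: V<sub>od8</sub> = 70 mV (implied by 0.37·V<sub>od8</sub> ≈ 26 mV) and V<sub>od1</sub> = 75 mV (the midpoint of the stated 70~80 mV range). Variable names are hypothetical.

```python
import math

alpha = 1 / 12       # cascode bias factor, from the text
lam = 0.2            # lambda, from the text
a4_over_a3 = 5 / 4   # aspect ratio A4/A3, from the text
v_od8 = 70e-3        # ASSUMED quiescent overdrive of M8, in volts
v_od1 = 75e-3        # ASSUMED overdrive of M1, in volts (midpoint of 70~80 mV)

# (6-1) divided by (6-2) with beta13 = beta8:
#   2*alpha*I_tail + dI_d8 = alpha*I_tail*(1 - lambda)
did8_over_itail = -(1 + lam) * alpha

# (6-2) divided by the quiescent condition 0.5*beta8*V_od8^2 = 2*alpha*I_tail:
#   (1 - dVx/V_od8)^2 = 0.5*(1 - lambda)
dvx_on_pos = (1 - math.sqrt(0.5 * (1 - lam))) * v_od8

# (6-3) with g_m1 = I_tail / V_od1:
von_pos = 2 * alpha * (1 + 2 * lam) * v_od1 / (1 + a4_over_a3)

print(f"dI_d8 / I_tail = {did8_over_itail:.3f}")
print(f"dVx_on_pos     = {dvx_on_pos * 1e3:.1f} mV")
print(f"V_on_pos       = {von_pos * 1e3:.1f} mV")
```

Under these assumptions the drain current change of M8 is one tenth of the tail current, and both thresholds come out at the values quoted alongside the equations.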

When  $V_{id}$  increases to a point where  $V_{id}>V_{on\_pos}$, transistor M14 turns on, transistor M13 works in the saturation region, and the negative feedback loop formed by transistors M21-M22 and M13-M14 is activated. As a result,  $\Delta V_x$  stays at  $\Delta V_{x,on\_pos}$  regardless of the differential current from M1 and M2,  $I_{dm}$, because the negative feedback loop makes M14 compensate  $I_{dm}$. Therefore, in the positive slewing phase, the drain currents of transistors M8 and M14 respectively become  $\alpha I_{tail}(1-\lambda) = I_{tail}/15$  and  $I_{tail}[1+(1-\lambda-A_4/A_3)*2\alpha] \approx I_{tail}$. This enhances the positive slew rate of the FCA. Once the output has slewed to the point where  $V_{id}<V_{on\_pos}$, the FCA's turn-around stage is deactivated and transistor M13 returns to the triode region. Note that transistor M8 always holds a small drain current of  $(1-\lambda)*\alpha I_{tail}=I_{tail}/15$, which prevents M8 from ever turning off and keeps the voltage change at  $V_x$  as small as  $0.37*V_{od8}\approx 26$mV. As a result, input transistor M1 does not work in the triode region in the slewing phase even when the input common mode voltage (ICMV) is close to the negative supply rail. Therefore, although the proposed FCA has an extremely small cascode bias current, it does not require a long time for the current to recover after the slewing phase completes: a long recovery time is generally caused by either transistor M8 working in the cutoff region or transistor M1 working in the triode region, and neither condition applies to the proposed FCA. In fact, the settling time is slightly improved because the positive SR is increased by setting the current mirror ratios of M14-to-M15 and M20-to-M18 larger than 1.

In the negative slewing phase, transistor M2 steers all the tail current into transistor M3, and transistor M4 then passes the mirrored current to discharge the load capacitor via transistor M8. In this slewing phase, the drain currents of transistors M8 and M10 are respectively  $[A_4/A_3*(2\alpha+1)-\alpha]*I_{tail}$  and  $2\alpha*I_{tail}$, which results in a net discharging current of  $[A_4/A_3*(2\alpha+1)-3\alpha]*I_{tail}$  to the load capacitor. This discharging current is slightly larger than that of the conventional FCA, whose discharging current is  $I_{tail}$  when its cascode bias current is larger than  $0.5*I_{tail}$. More importantly, in the negative slewing phase, the negative SRE circuit shown in Figure 6.2 also turns on to increase the transient discharging current to the load capacitor. The operation principles of the negative SRE circuit are described next.
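As a worked check of the negative slewing currents, substituting the design values stated earlier in this chapter, $\alpha = 1/12$ and $A_4/A_3 = 5/4$, into the expressions above gives (before any contribution from the negative SRE circuit):

$$I_{d8} = \left[\frac{5}{4}\left(\frac{1}{6} + 1\right) - \frac{1}{12}\right] I_{tail} = \frac{11}{8} I_{tail}, \qquad I_{d10} = 2\alpha I_{tail} = \frac{1}{6} I_{tail}$$

$$I_{d8} - I_{d10} = \left[\frac{5}{4}\left(\frac{1}{6} + 1\right) - \frac{3}{12}\right] I_{tail} = \frac{29}{24} I_{tail} \approx 1.21 * I_{tail}$$

The net discharging current is therefore about 21% larger than the $I_{tail}$ available in the conventional FCA.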

The negative SRE circuit of the FCA is shown in Figure 6.2. As can be seen, the quiescent bias currents of transistors M31, M32, M35, M37 and M41 are the same, I<sub>bias</sub>. Transistor M37's source voltage is designed to be less than transistor M38's threshold voltage in the quiescent operation. As a result, transistors M38, M39 and M40 work in the cutoff region in the quiescent operation. Transistors M35 and M36 are a matched input pair, so M36's drain current in the quiescent operation is the same as M35's bias current. As a result, the total drain current of transistors M36 and M42 is 2\*I<sub>bias</sub>. This 2\*I<sub>bias</sub> is smaller than 2.4\*I<sub>bias</sub>, the current that transistor M46 is intended to carry when it works in the saturation region. Therefore, transistors M46, M42 and M43 work in the triode, triode and cutoff regions respectively, which ensures zero bias current in transistor M43 and keeps the negative SRE circuit off in the quiescent operation.

However, upon application of a negative differential input signal,  $V_{id}$, to the input pair, transistor M36's drain current increases while transistor M35's drain current stays at I<sub>bias</sub>. The reason is that the negative feedback loop formed by M32, M34, M35, M37 and 2\*I<sub>bias</sub> always adjusts transistor M34's drain current to keep transistor M35's drain current constant at I<sub>bias</sub>. When V<sub>id</sub> decreases to the point,  $V_{on\_neg}$, at which the drain current of M36 has increased by 0.4\*I<sub>bias</sub>, the operating regions of transistors M46 and M42 change from the triode region to the saturation region. Any further increase of M36's drain current caused by a further decrease of V<sub>id</sub> flows into transistor M43 and is then amplified by the aspect ratio of transistors M44 and M43. The amplified drain current is passed to the load capacitor C<sub>L</sub> via transistors M44 and M45. Therefore, the boundary between enabling and disabling the negative SRE circuit can be marked by the transition of M46 and M42 from the triode region to the saturation region. According to this definition, the input-referred turn-on threshold voltage of the negative SRE circuit,  $V_{on\_neg}$, is calculated as (6-6) by solving (6-4) and (6-5), where  $V_{od36}$  is transistor M36's quiescent overdrive voltage. Assuming  $V_{od36}$  is about 70mV~80mV, the calculated  $V_{on\_neg}$  is about -12mV.

$$\left( V_{\text{od}36} - \Delta V_{\text{on\_neg}} \right)^2 * 0.5\beta_{36} = 1.4 * I_{\text{bias}}$$
 (6-4)

$$(V_{od36})^2 * 0.5\beta_{36} = 1.0 * I_{bias}$$
(6-5)

$$\Delta V_{\text{on\_neg}} = \left(1 - \sqrt{1.4}\right) * V_{\text{od}36} \approx -12\text{mV}$$
(6-6)
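The intermediate step between (6-4), (6-5) and (6-6) can be made explicit. Dividing (6-4) by (6-5) eliminates $\beta_{36}$ and $I_{bias}$; the root that corresponds to a larger overdrive voltage is taken, since M36's drain current increases at the threshold:

$$\left(1 - \frac{\Delta V_{\text{on\_neg}}}{V_{\text{od}36}}\right)^2 = 1.4 \;\Rightarrow\; \Delta V_{\text{on\_neg}} = \left(1 - \sqrt{1.4}\right) * V_{\text{od}36} \approx -0.18 * V_{\text{od}36}$$

The threshold therefore scales linearly with M36's quiescent overdrive voltage and is independent of the absolute value of $I_{bias}$.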

In order to improve the DC gain of the proposed FCA, a GE circuit based on conductance cancellation is implemented as shown in Figure 6.1. The forward path of the GE circuit is formed by transistors M23-M25, and the feedback path reuses transistors M7 and M3-M4 in the FCA core to form a flipped voltage attenuator (FVA). In the GE forward path, the voltage change at the  $V_x$  node is sensed and shifted up to voltage  $V_{fb}$  via the level shifter formed by transistors M23-M25. In the GE feedback path, voltage  $V_{fb}$  is fed back to the  $V_x$  node through the FVA. The voltage gain from  $V_{fb}$  to node 2's voltage,  $V_2$, is calculated as (6-7), from which the generated negative conductance is derived as (6-8). As a result, the net conductance looking down from the source of M8,  $g_x$, is obtained as (6-9). In order to maximize the FCA's DC gain,  $g_x$  should be designed to be close to zero but slightly negative.

$$\frac{V_2}{V_{fb}} \approx -\frac{g_{m7}(g_{ds2} + g_{ds3})}{(g_{m7} + g_{ds7})g_{m3}} \approx -\frac{g_{ds2} + g_{ds3}}{g_{m3}}$$
(6-7)

$$g_{\text{neg}} = \frac{V_2}{V_{\text{fb}}} * g_{\text{m4}} = -\frac{g_{\text{ds2}} + g_{\text{ds3}}}{g_{\text{m3}}} * g_{\text{m4}} \approx -g_{\text{ds4}} - g_{\text{ds2}} * \frac{A_4}{A_3}$$
(6-8)



$$g_{x} = g_{ds4} + g_{ds1} + g_{ds12} + g_{neg} \approx g_{ds12} - g_{ds2} \left(\frac{A_{4}}{A_{3}} - 1\right) = g_{ds12} - \frac{g_{ds2} * 2\alpha}{0.5 + 2\alpha}$$
(6-9)
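The last equality in (6-9) implies that $A_4/A_3 - 1 = 2\alpha/(0.5 + 2\alpha)$. As a small sanity check (a sketch with hypothetical variable names, using exact rational arithmetic), the design value $\alpha = 1/12$ reproduces the aspect ratio $A_4/A_3 = 5/4$ quoted in Section 6.1:

```python
from fractions import Fraction

alpha = Fraction(1, 12)  # cascode bias factor used in this design

# Right-hand factor of (6-9): (A4/A3 - 1) = 2*alpha / (0.5 + 2*alpha)
ratio_minus_one = 2 * alpha / (Fraction(1, 2) + 2 * alpha)
a4_over_a3 = 1 + ratio_minus_one

print(f"A4/A3 - 1 = {ratio_minus_one}")
print(f"A4/A3     = {a4_over_a3}")
```

With this value, the conductance cancelled at the source of M8 is a quarter of $g_{ds2}$, so $g_{ds12}$ has to be of that order for $g_x$ to approach zero.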

## **6.2.Frequency Response Analysis**

In order to understand the frequency response of the proposed FCA in Figure 6.1, its small signal block diagram is drawn in Figure 6.3. In the following analysis, the following assumptions are made:

- 1) The transconductances of transistors M1-M13 and M23-M25 are much larger than their output conductances. For example, g<sub>mi</sub>>>g<sub>dsi</sub>, where g<sub>mi</sub> and g<sub>dsi</sub> are respectively transistor M<sub>i</sub>'s transconductance and output conductance.
- 2) The parasitic capacitances at nodes  $V_x$  and  $V_1$  are the same.
- 3) The load capacitor, C<sub>L</sub>, is much larger than the parasitic capacitances at the FCA's internal nodes. For example, C<sub>L</sub>>>C<sub>1</sub>, C<sub>2</sub> and C<sub>x</sub>.



Figure 6.3: Small signal block diagram of the proposed FCA

The transfer function from  $V_x$  to  $V_{fb}$, u(s), is found as (6-10). The transfer function has one pole and one zero, located at frequencies of  $0.5*f_T$  and  $f_T$  respectively, where  $f_T$  is the unity current gain frequency of transistor M23. Since transistor M23's f<sub>T</sub> is about 100MHz in this design and is much higher than the proposed FCA's GBW (2.4MHz), u(s) can be approximated as 1 for frequencies below the GBW. In order to derive the transfer function from the FCA's inputs to its output,  $\frac{V_o}{V_{id}}$, KCL equations at nodes V<sub>1</sub>, V<sub>2</sub>, V<sub>x</sub> and V<sub>o</sub> are written as (6-11) to (6-14), where g<sub>i</sub> and C<sub>i</sub> are respectively the conductance and parasitic capacitance at node i. The expressions of g<sub>1</sub>, g<sub>2</sub>, g<sub>x</sub>, g<sub>L</sub>, C<sub>1</sub>, C<sub>2</sub> and C<sub>x</sub> are shown in Table 6.1. After solving the KCL equations (6-11) to (6-14), the transfer function  $\frac{V_o}{V_{id}}$  is derived as (6-15) and rewritten as (6-16). Equation (6-16) is further simplified to (6-17) by substituting (6-18) into (6-16).

Table 6.1: Expressions of the conductance and capacitance in the proposed FCA

| $g_1 = g_{ds2} + g_{ds3}$                                   | $C_1 \approx C_{db2} + C_{gd2} + C_{db3} + C_{gd3} + C_{gs7}$                                  |
|-------------------------------------------------------------|------------------------------------------------------------------------------------------------|
| $g_2 \approx g_{ds5} g_{ds9}/g_{m9}$                        | $C_2 \approx C_{gs3} + C_{gd3} + C_{gs4} + C_{gd4}$                                            |
| $g_x \approx g_{ds1} + g_{ds4} + g_{ds11}$                  | $C_x \approx C_{db1} + C_{gd1} + C_{db4} + C_{gd4} + C_{gs8} + C_{gs13} + C_{gd14} + C_{gd14}$ |
| $g_L \approx g_{ds6} g_{ds10}/g_{m10} + g_x g_{ds8}/g_{m8}$ |                                                                                                |

$$u(s) = \frac{V_{fb}}{V_x} \approx \frac{\left(1 + s\frac{C_{gs23}}{g_{m23}}\right)}{1 + s\frac{C_{gs23} + C_{gs7}}{g_{m23}}} \approx \frac{\left(1 + s\frac{C_{gs23}}{g_{m23}}\right)}{1 + s\frac{2C_{gs23}}{g_{m23}}} \approx 1$$
(6-10)
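To see how good the approximation u(s) ≈ 1 is at the GBW, (6-10) can be evaluated numerically. The sketch below assumes C<sub>gs7</sub> ≈ C<sub>gs23</sub> (as in the second form of (6-10)), so that the zero sits at f<sub>T</sub> and the pole at 0.5·f<sub>T</sub>, and uses the values quoted in the text (f<sub>T</sub> ≈ 100 MHz, GBW ≈ 2.4 MHz). The function name `u` and its arguments are hypothetical.

```python
import cmath
import math


def u(f_hz, f_t_hz):
    """Evaluate (6-10) at frequency f_hz, assuming C_gs7 ~= C_gs23.

    Zero at f_T and pole at 0.5*f_T, where f_T = g_m23 / (2*pi*C_gs23).
    """
    s_over_wt = 1j * f_hz / f_t_hz
    return (1 + s_over_wt) / (1 + 2 * s_over_wt)


value = u(2.4e6, 100e6)
magnitude = abs(value)
phase_deg = math.degrees(cmath.phase(value))

print(f"|u| at GBW   = {magnitude:.4f}")
print(f"phase at GBW = {phase_deg:.2f} degrees")
```

Under these assumptions the level shifter changes the magnitude by well under 1% and contributes between one and two degrees of phase lag at the GBW, which supports treating u(s) as unity in the subsequent derivation.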

$$\frac{g_{m1}V_{id}}{2} + V_1(g_1 + sC_1) + g_{m7}V_1 + g_{ds7}(V_1 - V_2) + g_{m3}V_2 - g_{m7}V_x * u(s) = 0$$
(6-11)

$$V_1 * g_{m7} + (V_1 - V_2) * g_{ds7} - V_2(g_2 + sC_2) = 0$$
(6-12)

$$V_2 * g_{m4} + V_x(g_{m8} + g_{ds8} + g_x + sC_x) - V_0 * g_{ds8} - \frac{V_{id}}{2} * g_{m1} = 0$$
(6-13)

$$V_{o}(g_{L} + g_{ds8} + sC_{L}) = V_{x}(g_{m8} + g_{ds8})$$
(6-14)

$$\frac{V_{o}}{V_{id}} \approx \frac{0.5g_{m1}g_{m8}[s^{2}C_{1}C_{2} + g_{m7}sC_{2} + (g_{m3} + g_{m4})g_{m7} + g_{m7}g_{m4}u(s)]}{(g_{L} + sC_{L})(g_{m8} + sC_{x})[s^{2}C_{1}C_{2} + sg_{m7}C_{2} + g_{m3}g_{m7} + g_{m7}g_{m4}u(s)]}$$
(6-15)



$$\frac{V_{o}}{V_{id}} \approx \frac{\frac{g_{m1}}{2 * g_{L}}}{(1 + s\frac{C_{L}}{g_{L}})\left(1 + s\frac{C_{x}}{g_{m8}}\right)} * \frac{s^{2} + \frac{g_{m7}}{C_{1}}s + \frac{(g_{m3} + 2g_{m4})g_{m7}}{C_{1}C_{2}}}{s^{2} + \frac{g_{m7}}{C_{1}}s + \frac{(g_{m3} + g_{m4})g_{m7}}{C_{1}C_{2}}}$$
(6-16)

$$\frac{V_{o}}{V_{id}} = \frac{\frac{g_{m1}}{2 * g_{L}}}{(1 + s\frac{C_{L}}{g_{L}})\left(1 + \frac{s}{k_{2}GBW}\right)} * \frac{s^{2} + k_{2}GBWs + k_{1}k_{2}(1 + 2k_{3})GBW^{2}}{s^{2} + k_{2}GBWs + k_{1}k_{2}(1 + k_{3})GBW^{2}}$$
(6-17)

$$k_1 = \frac{\frac{g_{m3}}{C_2}}{GBW}, \quad k_2 = \frac{\frac{g_{m7}}{C_1}}{GBW} = \frac{\frac{g_{m8}}{C_x}}{GBW}, \quad k_3 = \frac{g_{m4}}{g_{m3}}, GBW = \frac{g_{m1}}{C_L} * \frac{k_3 + 1}{2}$$
 (6-18)

As can be seen from (6-17), there are four poles and two zeros in the proposed FCA's transfer function. Their frequencies are calculated as (6-19) to (6-24). Because the drain current of transistor M7 is much smaller than that of transistor M3,  $k_1 > k_2$  and  $\left(\frac{k_2}{2}\right)^2 < k_1 k_2$. As a result, the poles and zeros (P<sub>nd2</sub>, P<sub>nd3</sub>, Z<sub>nd1</sub> and Z<sub>nd2</sub>) are complex. The natural frequencies of the complex poles and zeros are respectively  $GBW * \sqrt{k_1k_2(1+k_3)}$  and  $GBW * \sqrt{k_1k_2(1+2k_3)}$, which are higher than those of the proposed FCA in Chapter 5 because  $k_3>0$. These higher nondominant pole and zero frequencies are brought by the additional GE path in this proposed FCA. In addition, compared with the proposed FCA in Chapter 5, this proposed FCA also has a higher P<sub>nd1</sub> frequency due to the larger bias current in its cascode stage. The distribution of all the poles and zeros of this proposed FCA in the s-plane is shown in Figure 6.4. As can be seen, the complex poles have a lower natural frequency and a lower Q-factor than the complex zeros. However, the complex poles and zeros are so close to each other that the phase drop caused by them is minimal, as calculated by (6-25). Figure 6.5 illustrates the dependency of the phase drop on k<sub>1</sub> and k<sub>2</sub>, from which we can see that the phase drop is less than 2.5 degrees even when  $k_1=2*k_2$  and  $k_2$  is as low as 2. The entire FCA's phase margin is calculated as (6-26) and its dependency on k<sub>1</sub> and k<sub>2</sub> is shown in Figure 6.6. In this design,  $k_1=2$,  $k_2=4$  and  $k_3=1.25$. Therefore, the expected phase margin of the op amp is about 71 degrees.



Figure 6.4: Distribution of the proposed FCA's poles and zeros

$$P_{d1} = -\frac{g_L}{C_L} \tag{6-19}$$

$$P_{nd1} = -\frac{g_{m8}}{C_x} = -k_2 * GBW$$
(6-20)

$$P_{nd2} = -GBW * \left(\frac{k_2}{2} - \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2 (1 + k_3)}\right)$$
(6-21)

$$P_{nd3} = -GBW * \left(\frac{k_2}{2} + \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2 (1+k_3)}\right)$$
(6-22)

$$Z_{nd1} = -GBW * \left(\frac{k_2}{2} - \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2 (1 + 2k_3)}\right)$$
(6-23)

$$Z_{nd2} = -GBW * \left(\frac{k_2}{2} + \sqrt{\left(\frac{k_2}{2}\right)^2 - k_1 k_2 (1 + 2k_3)}\right)$$
(6-24)
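The pole and zero expressions (6-21) to (6-24) can be evaluated numerically as a quick sanity check. The following is a minimal sketch, assuming the design values k1=2, k2=4 and k3=1.25 quoted in the text and working in units of GBW; the function names are illustrative and not part of the original design flow.

```python
import cmath
import math


def quadratic_roots(k1, k2, k3, zero=False):
    """Roots of s^2 + k2*s + k1*k2*(1 + m*k3) = 0 in units of GBW.

    m = 1 gives the nondominant poles (6-21)/(6-22),
    m = 2 gives the nondominant zeros (6-23)/(6-24).
    """
    m = 2.0 if zero else 1.0
    c = k1 * k2 * (1.0 + m * k3)
    disc = cmath.sqrt((k2 / 2.0) ** 2 - c)  # imaginary when the pair is complex
    return (-(k2 / 2.0) + disc, -(k2 / 2.0) - disc)


def natural_freq_and_q(k1, k2, k3, zero=False):
    """Natural frequency (in units of GBW) and Q-factor of the pair."""
    m = 2.0 if zero else 1.0
    wn = math.sqrt(k1 * k2 * (1.0 + m * k3))
    return wn, wn / k2


k1, k2, k3 = 2.0, 4.0, 1.25  # design values quoted in the text
poles = quadratic_roots(k1, k2, k3)
zeros = quadratic_roots(k1, k2, k3, zero=True)
wn_p, q_p = natural_freq_and_q(k1, k2, k3)
wn_z, q_z = natural_freq_and_q(k1, k2, k3, zero=True)

print("poles / GBW:", poles, "wn =", round(wn_p, 3), "Q =", round(q_p, 3))
print("zeros / GBW:", zeros, "wn =", round(wn_z, 3), "Q =", round(q_z, 3))
```

For these values both pairs are complex, and the poles come out with a lower natural frequency and a lower Q-factor than the zeros, in line with the description of Figure 6.4.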

$$\phi = -\tan^{-1}\left\{\frac{k_2}{k_1k_2(1+k_3)-1}\right\} + \tan^{-1}\left\{\frac{k_2}{k_1k_2(1+2k_3)-1}\right\}$$
(6-25)



$$PM = 90 - \tan^{-1}\left(\frac{1}{k_2}\right) - \tan^{-1}\left\{\frac{k_2}{k_1k_2(1+k_3)-1}\right\} + \tan^{-1}\left\{\frac{k_2}{k_1k_2(1+2k_3)-1}\right\}$$
(6-26)

Figure 6.5: Phase drop due to complex poles and zeros vs. k1 and k2



Figure 6.6: The FCA's PM vs. k1 and k2
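As a worked check of (6-25) and (6-26), the sketch below evaluates the phase drop of the complex pole-zero pairs and the resulting phase margin. It assumes the two denominators in (6-25) are positive, which holds for the values used here, and the function names are illustrative only.

```python
import math


def pole_zero_phase_drop_deg(k1, k2, k3):
    """Phase contribution of the complex pole/zero pairs at GBW, per (6-25)."""
    pole_term = math.atan(k2 / (k1 * k2 * (1.0 + k3) - 1.0))
    zero_term = math.atan(k2 / (k1 * k2 * (1.0 + 2.0 * k3) - 1.0))
    return math.degrees(-pole_term + zero_term)


def phase_margin_deg(k1, k2, k3):
    """Phase margin per (6-26): 90 deg minus the P_nd1 lag plus the pair term."""
    pnd1_lag = math.degrees(math.atan(1.0 / k2))
    return 90.0 - pnd1_lag + pole_zero_phase_drop_deg(k1, k2, k3)


# Design point quoted in the text: k1 = 2, k2 = 4, k3 = 1.25
pm_design = phase_margin_deg(2.0, 4.0, 1.25)

# Case discussed for Figure 6.5: k1 = 2*k2 with k2 = 2
drop_worst = pole_zero_phase_drop_deg(4.0, 2.0, 1.25)

print(f"PM at design point: {pm_design:.2f} deg")
print(f"Pair phase drop at k1=4, k2=2: {drop_worst:.2f} deg")
```

The design point evaluates to roughly 71 degrees, and the k1=2\*k2, k2=2 case gives a drop just under 2.5 degrees, matching the figures quoted in the text.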

# **6.3. Noise Analysis**

The noise of the proposed FCA is analyzed in comparison with the conventional fast FCA in Figure 5.5(b) so as to understand the noise reduction brought by the bias current reduction in the cascode stage of the proposed FCA. The noise model of the proposed FCA is shown in Figure 6.7, after neglecting the noise contributed by the cascode transistors and the transistors working in the cutoff region. The proposed FCA's output current noise power is derived as (6-27), where a transistor's voltage noise power is expressed as (6-28). The terms  $\frac{8KT}{3g_{mi}}$  and  $\frac{K_f}{W_iL_iC_{ox}f}$  in (6-28) respectively represent a transistor's thermal and flicker noise. The transistors in current mirrors are typically sized to have the same length and current density. Consequently, their widths and transconductances scale linearly with their bias currents. Therefore, their voltage noise power is inversely proportional to their bias currents, whereas their current noise power is linearly proportional to their bias currents, as shown in (6-28) and (6-29). As a result, the noise relations in (6-30) can be established. After plugging (6-30) into (6-27), (6-27) is simplified to (6-31). Equation (6-31) is further simplified to (6-32) by neglecting the term  $\frac{I_{n5}^2}{8\alpha}(k_3 - 1)^2$, because this term is much smaller than  $I_{n5}^2(k_3^2 + 2)$. As a result, the input referred voltage noise power of the FCA is derived as (6-33).



Figure 6.7: Noise model for the proposed op amp



$$I_{\text{no,prop}}^{2} \approx \frac{g_{\text{m0}}^{2}e_{\text{n0}}^{2}}{4} \left(\frac{g_{\text{m4}}}{g_{\text{m3}}} - 1\right)^{2} + \left(g_{\text{m3}}^{2}e_{\text{n3}}^{2} + g_{\text{m2}}^{2}e_{\text{n2}}^{2} + g_{\text{m5}}^{2}e_{\text{n5}}^{2}\right) * \frac{g_{\text{m4}}^{2}}{g_{\text{m3}}^{2}} + \left(g_{\text{m4}}^{2}e_{\text{n4}}^{2} + g_{\text{m1}}^{2}e_{\text{n1}}^{2} + g_{\text{m6}}^{2}e_{\text{n6}}^{2} + g_{\text{m11}}^{2}e_{\text{n11}}^{2} + g_{\text{m25}}^{2}e_{\text{n25}}^{2}\right)$$
(6-27)

$$\frac{e_{ni}^2}{\Delta f} = \frac{8KT}{3g_{mi}} + \frac{K_f}{W_i L_i C_{ox} f} \propto \frac{1}{I_{bias}}$$
(6-28)

$$I_{ni}^{2} = \frac{e_{ni}^{2}g_{mi}^{2}}{\Delta f} = \frac{g_{mi} * 8KT}{3} + \frac{g_{mi}^{2} * K_{f}}{W_{i}L_{i}C_{ox}f} \propto I_{bias}$$
(6-29)

$$I_{n5}^{2} = I_{n6}^{2} = 2I_{n11}^{2} = 2I_{n25}^{2} = 2\alpha I_{n0}^{2}; \ I_{n1}^{2} = I_{n2}^{2}; \ I_{n3}^{2} = \frac{I_{n4}^{2}}{k_{3}}$$
(6-30)

$$I_{no,prop}^{2} \approx \frac{I_{n5}^{2}}{8\alpha} (k_{3} - 1)^{2} + (I_{n3}^{2} + I_{n1}^{2} + I_{n5}^{2}) * k_{3}^{2} + (k_{3}I_{n3}^{2} + I_{n1}^{2} + 2I_{n5}^{2})$$
(6-31)


$$I_{no,prop}^{2} \approx I_{n1}^{2} * (1 + k_{3}^{2}) + I_{n3}^{2}(k_{3} + k_{3}^{2}) + I_{n5}^{2}(2 + k_{3}^{2})$$
(6-32)
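The step from (6-31) to (6-32) drops the tail-source term. A minimal sketch of how small that term is relative to the retained I<sub>n5</sub> term, assuming the design values k3=1.25 and α=1/12 quoted in this chapter (the helper name is illustrative):

```python
def neglected_to_kept_ratio(k3, alpha):
    """Ratio of the dropped term (k3-1)^2/(8*alpha) to the kept term (k3^2+2).

    Both terms multiply I_n5^2 in (6-31), so the common factor cancels.
    """
    neglected = (k3 - 1.0) ** 2 / (8.0 * alpha)
    kept = k3 ** 2 + 2.0
    return neglected / kept


ratio = neglected_to_kept_ratio(k3=1.25, alpha=1.0 / 12.0)
print(f"neglected / kept = {ratio:.4f}")
```

With these values the dropped term is under 3% of the retained one, which supports neglecting it.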

$$V_{ni}^{2} = \frac{I_{no,prop}^{2}}{[0.5 * g_{m1} * (k_{3} + 1)]^{2}} = \frac{I_{n1}^{2} * (1 + k_{3}^{2}) + I_{n3}^{2}(k_{3} + k_{3}^{2}) + I_{n5}^{2}(2 + k_{3}^{2})}{0.5 * g_{m1} * (k_{3} + 1) * GBW * C_{L}}$$
(6-33)

$$V_{no}^{2} = V_{ni}^{2} * \frac{\pi GBW}{2 * 2\pi} = \frac{[I_{n1}^{2} * (1 + k_{3}^{2}) + I_{n3}^{2}(k_{3} + k_{3}^{2}) + I_{n5}^{2}(2 + k_{3}^{2})]}{2g_{m1} * (k_{3} + 1) * C_{L}}$$
(6-34)

$$V_{\text{no,thermal}}^{2} = \frac{4\text{KT}}{3} \frac{\left[(1+k_{3}^{2}) + a(k_{3}+k_{3}^{2}) + b(2+k_{3}^{2})\right]}{(k_{3}+1)\text{C}_{L}} \approx \frac{2.4\text{KT}}{\text{C}_{L}}$$
(6-35)

$$V_{\text{no,thermal,conv}}^{2} = \frac{\frac{4\text{KT}}{3} * [2 + 2g'_{\text{m3}}/g_{\text{m1}} + 2g'_{\text{m5}}/g_{\text{m1}}]}{2C_{\text{L}}}$$

$$= \frac{4\text{KT}}{3C_{\text{L}}} * \left[1 + a * \frac{r + 0.5}{2\alpha + 0.5} + b * \frac{r}{2\alpha}\right] \approx \frac{3.2\text{KT}}{C_{\text{L}}}$$
(6-36)

When the FCA is placed in a noninverting unity gain buffer configuration, the equivalent rectangular noise bandwidth of the FCA is GBW/4, where  $GBW = 0.5g_{m1}(k_3 + 1)/C_L$. Therefore, the in-band output referred voltage noise power of the proposed FCA is calculated as (6-34), in which the dominant noise source for a wideband FCA is thermal noise. The in-band thermal noise of the FCA is calculated as (6-35). This equation suggests that a, b and k<sub>3</sub> should be minimized in order to minimize the in-band thermal noise for a given load capacitor. In this design, a=g<sub>m3</sub>/g<sub>m1</sub>=0.4, b=g<sub>m5</sub>/g<sub>m1</sub>=0.14, k<sub>3</sub>=5/4 and  $\alpha=1/12$. As a result, the proposed FCA's in-band thermal noise is calculated as 2.4KT/C<sub>L</sub>, or 99µV at T=300K and C<sub>L</sub>=1pF, after plugging a, b and k<sub>3</sub> into (6-35).

Similarly, the thermal noise of the conventional FCA counterpart in Figure 5.5(b) is found as (6-36), where g'<sub>m3</sub> and g'<sub>m5</sub> are the transconductances of transistors M3 and M5 in the conventional FCA. With a typical bias current of r\*I<sub>tail</sub>=0.67\*I<sub>tail</sub> for the conventional FCA's cascode stage, it can be found that  $\frac{g'_{m3}}{g_{m1}} = a * \frac{r+0.5}{2\alpha+0.5}$  and  $\frac{g'_{m5}}{g_{m1}} = b * \frac{r}{2\alpha}$. As a result, the integrated thermal noise voltage of the conventional FCA is obtained as 3.2KT/C<sub>L</sub>, or 115µV at T=300K and C<sub>L</sub>=1pF, after plugging in a=0.4, b=0.07,  $\alpha$=1/12 and r=0.67. Therefore, compared to the conventional FCA, the proposed FCA is expected to reduce the in-band noise voltage by 14%.
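The conversion from a noise power of the form n·KT/C<sub>L</sub> to an rms voltage can be reproduced as follows. This is a small sketch using the coefficients 2.4 and 3.2 quoted in the text, T=300K and C<sub>L</sub>=1pF; the function name is an illustrative choice.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def rms_noise_volts(n_ktc, temp_k, c_load_f):
    """RMS voltage for a noise power of n_ktc * k*T / C_L."""
    return math.sqrt(n_ktc * BOLTZMANN * temp_k / c_load_f)


v_prop = rms_noise_volts(2.4, 300.0, 1e-12)  # proposed FCA, 2.4 KT/C_L
v_conv = rms_noise_volts(3.2, 300.0, 1e-12)  # conventional FCA, 3.2 KT/C_L
reduction = 1.0 - v_prop / v_conv

print(f"proposed:     {v_prop * 1e6:.1f} uV")
print(f"conventional: {v_conv * 1e6:.1f} uV")
print(f"reduction:    {reduction * 100:.1f} %")
```

The results land at roughly 100µV and 115µV, a reduction of about 13 to 14% in rms noise voltage, consistent with the rounded figures in the text.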

# **6.4. Offset Voltage Analysis**

The variances of transistor  $M_i$'s threshold voltage and of  $\Delta\beta_i/\beta_i$  are expressed as (6-37), where  $\beta_i = \mu C_{ox} W_i/L_i$. In addition,  $A_{thi}^2$  and  $A_{\beta i}^2$  are the mismatch coefficients, fixed for a given process, of transistor  $M_i$'s threshold voltage and feature sizes. Transistor  $M_i$'s drain current variation due to random mismatch is shown in (6-38), where  $I_{di}$  and  $V_{odi}$  are respectively the transistor's quiescent current and overdrive voltage. Based on the sizing strategy of fixed current density for transistor  $M_i$, (6-38) shows that transistor  $M_i$'s drain current variation is proportional to its bias current. The larger the bias current is, the larger the drain current variation is.

The input referred offset voltage of an FCA can be analyzed in a very similar manner to how noise is analyzed in Section 6.3. The proposed FCA's output current variation caused by the mismatches of transistors M1-M6, M11 and M25 is shown as (6-39). Therefore, its input referred offset voltage,  $V_{os,prop}$, is calculated as (6-40), in which  $c = I_{os3}^2/I_{os1}^2$  and  $d = I_{os5}^2/I_{os1}^2$. Similarly, the input referred offset voltage of the conventional FCA in Figure 5.5(b),  $V_{os,conv}$, is calculated as (6-41), in which r=0.67 and  $\alpha$=1/12. Compared to  $V_{os,conv}$,  $V_{os,prop}$  is reduced due to the reduced offset contribution from transistors M3 and M5. This will also be confirmed by the Monte Carlo simulation results.

$$\sigma_{\text{vthi}}^2 = \frac{A_{\text{thi}}^2}{W_i L_i} \quad , \quad \sigma^2(\frac{\Delta\beta_i}{\beta_i}) = \frac{A_{\beta i}^2}{W_i L_i} \tag{6-37}$$

$$I_{osi}^{2} = \sigma_{vthi}^{2} g_{mi}^{2} + \sigma^{2} \left(\frac{\Delta\beta_{i}}{\beta_{i}}\right) I_{di}^{2} = \frac{(A_{\beta i}^{2} V_{odi}^{2} + 4A_{thi}^{2}) I_{di}^{2}}{W_{i} L_{i} V_{odi}^{2}} \propto \frac{I_{di}^{2}}{W_{i}} \propto I_{di}$$
(6-38)

$$I_{os,out}^{2} = I_{os1}^{2} * (1 + k_{3}^{2}) + I_{os3}^{2} (k_{3} + k_{3}^{2}) + I_{os5}^{2} (2 + k_{3}^{2})$$
(6-39)

$$V_{os,prop}^{2} = \frac{I_{os1}^{2} * \left[ (1 + k_{3}^{2}) + c * (k_{3} + k_{3}^{2}) + d * (2 + k_{3}^{2}) \right]}{[0.5 * g_{m1} * (k_{3} + 1)]^{2}}; c = \frac{I_{os3}^{2}}{I_{os1}^{2}}; d = \frac{I_{os5}^{2}}{I_{os1}^{2}}$$
(6-40)

$$V_{\rm os,conv}^2 = \frac{2(I_{\rm os1}^2 + I_{\rm os3,conv}^2 + I_{\rm os5,conv}^2)}{g_{\rm m1}^2} = \frac{2I_{\rm os1}^2}{g_{\rm m1}^2} (1 + c * \frac{r + 0.5}{2\alpha + 0.5} + d * \frac{r}{2\alpha})$$
(6-41)
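To see how (6-40) and (6-41) compare, both can be normalized to  $I_{os1}^2/g_{m1}^2$  and evaluated side by side. The sketch below uses k3=1.25, r=0.67 and α=1/12 from the text; the values of c and d are not given in this section, so the numbers used for them are purely hypothetical placeholders chosen for illustration.

```python
import math


def vos2_prop_norm(k3, c, d):
    """(6-40) divided by I_os1^2 / g_m1^2."""
    num = (1.0 + k3 ** 2) + c * (k3 + k3 ** 2) + d * (2.0 + k3 ** 2)
    return num / (0.5 * (k3 + 1.0)) ** 2


def vos2_conv_norm(c, d, r, alpha):
    """(6-41) divided by I_os1^2 / g_m1^2."""
    return 2.0 * (1.0 + c * (r + 0.5) / (2.0 * alpha + 0.5) + d * r / (2.0 * alpha))


k3, r, alpha = 1.25, 0.67, 1.0 / 12.0  # values quoted in the text
c, d = 0.5, 0.2                        # HYPOTHETICAL mismatch ratios, not from the source

variance_ratio = vos2_prop_norm(k3, c, d) / vos2_conv_norm(c, d, r, alpha)
sigma_ratio = math.sqrt(variance_ratio)

print(f"Vos^2 ratio (prop/conv): {variance_ratio:.3f}")
print(f"sigma ratio (prop/conv): {sigma_ratio:.3f}")
```

The benefit depends on c and d: as the cascode-stage contributions shrink toward zero the two expressions become nearly equal, so the reduction comes from the smaller M3 and M5 terms rather than from the input pair.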

#### **6.5. Simulation Results**

In order to confirm the effectiveness and robustness of the performance improvement brought by the proposed FCA, two design examples are implemented in a 180nm CMOS process. The first design example is the conventional (conv.) FCA shown in Figure 5.5(b). The second design example is the proposed (prop.) FCA shown in Figure 6.1. Extensive simulations, under process corner variations and under process corner plus mismatch variations, are conducted to compare the two design examples. The purposes of the simulations are fourfold: a) to verify that the proposed FCA largely improves the FCA's CUE; b) to verify that the proposed FCA largely improves the FCA's DC gain; c) to verify that the proposed FCA largely improves the FCA's SR; and d) to confirm the compatibility of the proposed gain, SR and CUE enhancement techniques.

All the simulation results below are collected with the design examples placed in a noninverting unity gain buffer configuration with a load capacitor of 1pF and a supply voltage of 1.8V. The nominal supply currents of the proposed and conventional op amps are respectively  $2.58\mu$A and  $3.5\mu$A, while the op amps' tail currents are the same at  $1.5\mu$A.

#### 6.5.1 Typical corner simulation results

#### 6.5.1.1 Frequency response

The frequency responses of the proposed and conventional fast FCAs are shown in Figure 6.8. The proposed FCA's GBW, 2.4MHz, is slightly higher than that of the conventional FCA, 2.0MHz, given that the size ratio of transistor M4 to transistor M3 is 1.25, slightly larger than 1. The phase margins (PM) of the proposed and conventional FCAs are 75.5° and 74° respectively, which match well with the theoretical calculations. The slight PM difference is caused by a much lower bias current in the proposed FCA's cascode stage. In the two design examples, the cascode stage's bias currents in the proposed and conventional FCAs are respectively 0.167 times and 0.67 times I<sub>tail</sub>. In addition, the DC gain of the proposed FCA is about 20dB higher than that of the conventional FCA: the DC gains of the proposed and conventional op amps are about 103.8dB and 83.5dB respectively. The DC gain enhancement in the proposed FCA is brought by both the gain enhancement circuit on the NMOS side and the smaller bias current in the cascode stage.





Figure 6.8: Frequency responses of the proposed and conventional FCAs

# 6.5.1.2 Noise performance

The simulated noise performance of the proposed and conventional FCAs is shown in Figure 6.9. As expected, the proposed FCA has a lower noise floor than the conventional FCA. For example, the voltage noise density of the proposed FCA at 100KHz is 73.47nV/sqrt(Hz), while that of the conventional FCA at 100KHz is 88.4nV/sqrt(Hz). The noise reduction of the proposed FCA is a natural byproduct of the bias current reduction in the cascode stage. The total integrated noise voltages from 0.01Hz to 2MHz for the proposed and conventional FCAs are respectively 99.24µV and 127.4µV. That is to say, compared with the conventional FCA, the proposed FCA reduces the integrated noise by 22%.





Figure 6.9: Noise performance of the proposed and conventional FCAs

# 6.5.1.3 Transient response

Figure 6.10 shows the step responses of the two FCAs with an input step voltage of 0.6V. As expected, the positive slew rate (SR+) of the proposed FCA is higher than that of the conventional FCA due to the new turn-around stage. The positive and negative slew rates (SR+ and SR-) of the proposed FCA are respectively  $SR_{+prop} = +5.84V/\mu s$  and  $SR_{-prop} = -5.1V/\mu s$, whereas those of the conventional FCA are respectively  $SR_{+conv} = +1.1V/\mu s$  and  $SR_{-conv} = -1.34V/\mu s$. The positive and negative SR improvements brought by the proposed FCA are 5.3 times and 3.8 times, so the average SR improvement is 4.6 times. The simulated SR+ improvement is slightly higher than the calculated improvement factor of 4, due to channel-length modulation effects in the current mirror M14-M15. The simulated SR- improvement matches very well with the theoretical calculations.





Figure 6.10: Transient responses of proposed and conventional FCAs

In addition, the settling times of the proposed and conventional FCAs are respectively 0.39µs and 0.75µs with a settling accuracy of 0.1% (Ts\_0.1%), and 0.54µs and 1.08µs with a settling accuracy of 0.01% (Ts\_0.01%). Therefore, Ts\_0.1% and Ts\_0.01% of the proposed FCA are shorter than those of the conventional FCA by 48% and 50% respectively. The simulated settling time of the two FCAs also matches the theoretical calculation for settling accuracy of 0.1% (7/GBW) and 0.01% (9/GBW) in a first-order system. In fact, the proposed FCA's settling times are slightly faster than the calculated settling times due to the low-gain positive feedback loop of the GE circuit in the FCA. This confirms that a long recovery time is not needed in the proposed FCA even though its cascode bias current is much smaller than its tail current.
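The 7/GBW and 9/GBW rules of thumb come from the exponential step response of a first-order system, where the settling time is τ·ln(1/ε) with τ = 1/(2π·GBW) for a unity gain buffer. The sketch below evaluates this purely linear estimate; it ignores slewing, which is an assumption and not part of the simulated results above, and the function name is illustrative.

```python
import math


def linear_settling_time_s(gbw_hz, accuracy):
    """First-order settling time to within `accuracy` (fractional error).

    Assumes a single-pole unity-gain buffer with time constant
    1 / (2*pi*GBW) and no slewing.
    """
    tau = 1.0 / (2.0 * math.pi * gbw_hz)
    return tau * math.log(1.0 / accuracy)


gbw_prop_hz = 2.38e6  # proposed FCA GBW from Table 6.2
ts_01 = linear_settling_time_s(gbw_prop_hz, 1e-3)   # 0.1% accuracy
ts_001 = linear_settling_time_s(gbw_prop_hz, 1e-4)  # 0.01% accuracy

print(f"ln(1/0.1%)  = {math.log(1e3):.2f}  (about 7 time constants)")
print(f"ln(1/0.01%) = {math.log(1e4):.2f}  (about 9 time constants)")
print(f"Ts_0.1%  estimate: {ts_01 * 1e6:.2f} us")
print(f"Ts_0.01% estimate: {ts_001 * 1e6:.2f} us")
```

For the proposed FCA the linear estimates come out somewhat above the simulated 0.39µs and 0.54µs, in line with the observation that the simulated settling is slightly faster than calculated.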

# 6.5.1.4 Performance summary for typical corner simulation

The performance of the two design examples is summarized in Table 6.2. As can be seen, compared with the conventional FCA, the proposed FCA reduces the current wasted in its cascode stage by about a factor of 2, which results in a 27% decrease in its total power consumption.



The CUE of the proposed and conventional FCAs are respectively 58% and 42%, so the proposed FCA increases the CUE by 1.38 times. In addition to its advantage in reducing the required supply current, the proposed FCA enhances the average SR by 4.6 times and reduces Ts\_0.1% and Ts\_0.01% by 48% and 50% respectively. Moreover, compared with the conventional FCA, the proposed FCA's input referred voltage noise density at 100KHz and integrated noise from 0.01Hz to 2MHz are reduced by 17% and 22% respectively.

| Output                                              | Unit        | Prop.      | Conv. |
|-----------------------------------------------------|-------------|------------|-------|
| GBW                                                 | MHz         | 2.38       | 2.04  |
| PM                                                  | degree      | 75.5       | 74    |
| DC Gain                                             | dB          | 103.8      | 83.5  |
| Isupply                                             | μA          | 2.58       | 3.5   |
| Iwaste                                              | μA          | 1.08       | 2     |
| Itail                                               | μA          | 1.5        | 1.5   |
| Iwaste/Itail                                        | %           | 71.77      | 133.5 |
| CUE (Itail/Isupply)                                 | %           | 58         | 42    |
| SR_avg                                              | V/µs        | 5.46       | 1.2   |
| Ts_0.1% @Vstep=0.6V                                 | μs          | 0.39       | 0.75  |
| Ts_0.01% @Vstep=0.6V                                | μs          | 0.54       | 1.08  |
| Vni @ 100KHz                                        | nV/sqrt(Hz) | 73.47      | 88.4  |
| Vno integrated to 2MHz                              | μV          | 99.24      | 127.4 |
| FOMs                                                | pF*MHz/µA   | 0.93       | 0.56  |
| FOML                                                | pF*V/µA-µs  | 2.12       | 0.35  |
| FOM <sub>Ts_0.1%</sub>                              | pF/µA-µs    | 0.99       | 0.38  |
| FOM <sub>Ts_0.01%</sub>                             | pF/µA-µs    | 0.72       | 0.26  |
| FOM <sub>noise</sub> (total noise/input pair noise) | $(V/V)^2$   | 2.78       | 5     |
| CL                                                  | pF          | 1          | 1     |
| Supply Voltage                                      | V           | 1.8        | 1.8   |
| Process                                             |             | 180nm CMOS |       |

Table 6.2: Performance summary of the prop. and conv. FCAs in typical corner

As a result, compared with the conventional FCA, the proposed FCA improves the small signal figure of merit (FOM<sub>s</sub>) and the large signal figure of merit (FOM<sub>L</sub>), both defined in (6-42), by 66% and about 6 times respectively. Recalling the settling time and noise figures of merit defined in Section 5.4.1.4, the two figures of merit are rewritten as (6-43) and (6-44). In (6-43), Ts\_x% is the settling time of the FCA with x% settling accuracy in a noninverting unity gain buffer configuration, and the value of x can be 1, 0.1, 0.01 or 0.001 depending on the targeted application's settling accuracy requirement. I<sub>supply</sub> and C<sub>L</sub> are respectively the supply current and load capacitor of the FCA. Compared with the conventional FCA, the proposed FCA's FOM<sub>Ts\_0.1%</sub> and FOM<sub>Ts\_0.01%</sub> are improved by 2.6 times and 2.8 times respectively. In addition, the proposed FCA's FOM<sub>noise</sub> is improved by 1.8 times.

$$FOM_s = \frac{GBW * C_L}{I_{supply}}; \quad FOM_L = \frac{SR * C_L}{I_{supply}}$$
(6-42)

$$FOM_{Ts\_x\%} = \frac{C_L}{Ts\_x\% * I_{supply}} , \quad x = 1, 0.1, 0.01 ...$$
(6-43)

$$FOM_{noise} = \frac{V_{ni,total}^2}{V_{ni,input pair}^2}$$
(6-44)
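The figures of merit can be recomputed from the raw entries of Table 6.2 as a cross-check. In the sketch below the settling figure of merit is taken as C<sub>L</sub>/(Ts·I<sub>supply</sub>), which is an assumption chosen because it matches the tabulated unit pF/µA-µs and the tabulated values; the dictionary layout and function names are illustrative.

```python
# Raw typical-corner values from Table 6.2 (C_L in pF, currents in uA,
# GBW in MHz, SR in V/us, settling times in us).
TABLE_6_2 = {
    "prop": {"gbw": 2.38, "sr": 5.46, "ts_01": 0.39, "ts_001": 0.54,
             "isupply": 2.58, "itail": 1.5, "cl": 1.0},
    "conv": {"gbw": 2.04, "sr": 1.2, "ts_01": 0.75, "ts_001": 1.08,
             "isupply": 3.5, "itail": 1.5, "cl": 1.0},
}


def figures_of_merit(d):
    """Return FOMs, FOML, settling FOMs and CUE for one design."""
    return {
        "fom_s": d["gbw"] * d["cl"] / d["isupply"],          # pF*MHz/uA
        "fom_l": d["sr"] * d["cl"] / d["isupply"],           # pF*V/(uA*us)
        "fom_ts_01": d["cl"] / (d["ts_01"] * d["isupply"]),   # pF/(uA*us)
        "fom_ts_001": d["cl"] / (d["ts_001"] * d["isupply"]),
        "cue": d["itail"] / d["isupply"],                    # fraction
    }


fom_prop = figures_of_merit(TABLE_6_2["prop"])
fom_conv = figures_of_merit(TABLE_6_2["conv"])

for name in fom_prop:
    print(f"{name:11s} prop={fom_prop[name]:.3f} conv={fom_conv[name]:.3f} "
          f"ratio={fom_prop[name] / fom_conv[name]:.2f}")
```

Small differences from the rounded table entries are expected, since the table values appear to be computed from unrounded simulation data.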

#### 6.5.2 Process corner and temperature variation simulation results

In this section, the two designed FCAs are simulated under process corner and temperature (P.T.) variations from -40°C to 85°C. The purposes of the simulations are twofold: a) to verify the robustness of the proposed FCA under P.T. variations; and b) to confirm the advantages of the proposed FCA under P.T. variations. The simulated performance of the designed FCAs includes frequency response, transient response and noise performance, because these metrics are affected by P.T. variations. The independent process corner variations and temperature variations are listed in Table 6.3. In total, there are 25 simulation setups, including 1 typical corner and 24 combinations of P.T. variation.



|              | Typical | Corners              |
|--------------|---------|----------------------|
| Temperature  | 27°C    | -40°C, 27°C and 85°C |
| Low Vth MOS  | tntp    | snsp, snwp, wnsp, wnwp |
| High Vth MOS | tntp    | snsp, wnwp           |

 Table 6.3: Simulation setup with process corner and temperature variation
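The count of 25 setups follows from a full factorial sweep of Table 6.3 plus the typical corner. A minimal sketch of how such a list could be enumerated, assuming the three variables are swept independently as stated; the tuple layout and names are illustrative and not tied to any particular simulator.

```python
from itertools import product

TEMPS_C = (-40, 27, 85)
LOW_VTH_CORNERS = ("snsp", "snwp", "wnsp", "wnwp")
HIGH_VTH_CORNERS = ("snsp", "wnwp")
TYPICAL = (27, "tntp", "tntp")  # (temperature, low-Vth corner, high-Vth corner)


def build_setups():
    """Typical corner first, then every temperature/corner combination."""
    return [TYPICAL] + list(product(TEMPS_C, LOW_VTH_CORNERS, HIGH_VTH_CORNERS))


setups = build_setups()
print(len(setups), "setups")
for temp_c, low_vth, high_vth in setups[:3]:
    print(f"T={temp_c}C low-Vth={low_vth} high-Vth={high_vth}")
```

Here 3 temperatures times 4 low-Vth corners times 2 high-Vth corners gives the 24 P.T. combinations.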

### 6.5.2.1 Frequency Response



Figure 6.11: Frequency responses of the two FCAs a) proposed b) conventional

Figure 6.11 shows the frequency responses of the proposed and conventional FCAs under P.T. variations. The (min, typ, max) of the proposed FCA's simulated DC gain, phase margin (PM) and GBW are respectively (102dB, 104dB, 104dB), (73.7°, 75.8°, 77.4°) and (2.03MHz, 2.39MHz, 2.65MHz). On the other hand, the (min, typ, max) of the conventional FCA's simulated DC gain, phase margin (PM) and GBW are respectively (80dB, 83.5dB, 83.5dB), (73.7°, 74°, 74.5°) and (1.7MHz, 2.0MHz, 2.2MHz). The variations of DC gain, PM and GBW of both the conventional and proposed FCAs are small. The amount of DC gain enhancement is maintained at about 20dB under P.T. variations. Compared with the proposed FCA in Chapter 5, the leakage current of transistor M14 in this proposed FCA has been minimized. This is the reason why the proposed FCA can maintain a large DC gain enhancement in the fast NMOS corner when T=85°C.

#### 6.5.2.2 Noise Performance



Figure 6.12: Noise performance of the prop. and conv. FCAs under P.T. variation

The simulated noise performance of the proposed and conventional FCAs under P.T. variations is shown in Figure 6.12. The input referred voltage noise densities of the proposed and conventional FCAs are respectively 60.2~94.3nV/sqrt(Hz) and 72.3~115nV/sqrt(Hz) at a frequency of 100KHz. The integrated noise from 0.01Hz to 2MHz of the proposed and conventional FCAs are respectively 81.6~126.7µV and 104.6~161.1µV. Therefore, both the minimum and maximum of the proposed FCA's integrated noise are 21% lower than those of the conventional FCA. The noise performance improvement in the proposed FCA is a natural byproduct of a reduced bias current at the cascode stage.

#### 6.5.2.3 Transient Response

The transient responses of the two FCAs are simulated in the noninverting unity gain buffer configuration with an input step voltage of 0.6V under P.T. variations. The simulation results are shown in Figure 6.13. Both the positive and negative slew rates of the proposed and conventional FCAs show a very small spread under P.T. variations. This indicates the robustness of the proposed FCA in its positive slew rate enhancement. This robustness over P.T. variations is expected because the tail current in the positive slewing phase is amplified by a well-defined current gain, and the amplified current is then passed to the load capacitor by the turn-around stage. The 0.1% settling times (Ts\_0.1%) of the proposed and conventional FCAs range over 0.3~0.45µs and 0.72~0.78µs respectively. This clearly shows the advantage of the proposed FCA over the conventional FCA in terms of operation speed. It also shows that, unlike the conventional FCA, the proposed FCA does not suffer from any long recovery time under P.T. variations.





Figure 6.13: Transient responses of the prop. and conv. FCAs under P.T. variation


## 6.5.2.4 Performance Summary for P.T. Variation

|                         |             | Proposed |       |         | Conventional |       |         |
|-------------------------|-------------|----------|-------|---------|--------------|-------|---------|
| Output                  | Unit        | Min      | Max   | Typical | Min          | Max   | Typical |
| GBW                     | MHz         | 2.03     | 2.65  | 2.39    | 1.7          | 2.2   | 2.04    |
| PM                      | degree      | 73.65    | 77.35 | 75.8    | 73.7         | 74.5  | 74      |
| DC Gain                 | dB          | 101.9    | 103.8 | 103.8   | 80           | 83.5  | 83.5    |
| SR_avg                  | V/µs        | 4.03     | 6.33  | 5.81    | 1.2          | 1.21  | 1.23    |
| Ts_0.1%                 | μs          | 0.3      | 0.45  | 0.39    | 0.72         | 0.78  | 0.75    |
| Ts_0.01%                | μs          | 0.49     | 0.63  | 0.49    | 0.87         | 1.03  | 0.93    |
| Vni @ 100KHz            | nV/sqrt(Hz) | 60.17    | 94.28 | 73.47   | 72.3         | 113.8 | 88.4    |
| Vno integrated to 2MHz  | μV          | 81.61    | 126.7 | 99.15   | 104.6        | 161.1 | 127.4   |
| FOMs                    | pF*MHz/µA   | 0.78     | 1.03  | 0.93    | 0.48         | 0.63  | 0.56    |
| FOML                    | pF*V/µA-µs  | 1.57     | 2.45  | 2.25    | 0.345        | 0.354 | 0.352   |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs    | 0.87     | 1.3   | 0.99    | 0.37         | 0.4   | 0.38    |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs    | 0.62     | 0.8   | 0.79    | 0.25         | 0.28  | 0.27    |
| Isupply                 | μA          | 2.58     |       |         | 5            |       |         |
| Iwaste                  | μA          | 1.08     |       |         | 3.5          |       |         |
| Itail                   | μA          | 1.5      |       |         | 1.5          |       |         |
| CUE (Itail/Isupply)     | %           | 58.14    |       |         | 43           |       |         |
| CL                      | pF          | 1        |       |         | 1            |       |         |

Table 6.4: Performance summary of the prop. and conv. FCA under P.T. variation

The performance summary of the proposed and conventional FCAs under P.T. variations is shown in Table 6.4. Compared with the conventional FCA, the proposed FCA's output voltage noise, integrated from 0.01Hz to 2MHz, has been reduced by 21%. Ts\_0.1% and Ts\_0.01% of the proposed FCA are also reduced by 48%, while the proposed FCA's supply current is only about 74% of that of the conventional FCA. In addition, compared with the conventional FCA, the proposed FCA improves the DC gain by about 20dB and maintains this amount of gain enhancement well under P.T. variations. The significant supply current reduction, considerable settling time reduction, large DC gain enhancement and noise reduction under P.T. variations clearly demonstrate the advantages and robustness of the proposed FCA. This is also evidence that the proposed FCA has good design compatibility, which allows gain, slew rate and current utilization efficiency to all be improved simultaneously.

## 6.5.3 Process corner plus mismatch variation simulation results

In this section, the two designed FCAs are simulated under both process corner and mismatch (P.Mis) variations via 500-run Monte Carlo simulations. The purposes of the simulations are twofold: a) to verify the robustness of the proposed FCA under P.Mis variations; and b) to confirm the advantages of the proposed FCA under P.Mis variations. The simulated performance metrics discussed in this section are transient response, offset voltage, frequency response, noise and current utilization efficiency. The FOM<sub>s</sub>, FOM<sub>L</sub>, FOM<sub>Ts\_0.1%</sub>, and FOM<sub>Ts\_0.01%</sub> of the FCAs are also reported.



Figure 6.14: Transient responses of the prop. and conv. FCAs under P.Mis. variation





Figure 6.15: Average Ts\_0.01% of the proposed FCA under P.Mis. variation



Figure 6.16: Average Ts\_0.01% of the conventional FCA under P.Mis. variation

Figure 6.14 shows the simulated transient responses of the proposed and conventional FCAs under P.Mis variations. Figure 6.15 and Figure 6.16 respectively show the histograms of the average Ts\_0.01% of the proposed and conventional FCAs under P.Mis variations. The average SRs and settling times of both FCAs show normal distributions. The (mean, sigma) of the proposed FCA's average SR, Ts\_0.1% and Ts\_0.01% are respectively (5.44V/µs, 0.53V/µs), (0.38µs, 0.027µs) and (0.53µs, 0.032µs). On the other hand, the (mean, sigma) of the conventional FCA's average SR, Ts\_0.1% and Ts\_0.01% are respectively (1.23V/µs, 0.024V/µs), (0.75µs, 0.02µs) and (1.08µs, 0.021µs). Therefore, the average improvements of SR, Ts\_0.1% and Ts\_0.01% brought by the proposed FCA are respectively 4.4 times, 2 times and 2 times the performance of the conventional FCA. This clearly shows the favorable speed performance of the proposed FCA. Most importantly, all the speed improvement brought by the proposed FCA is achieved with a smaller power consumption, which is about 73.7% of that of the conventional FCA. In addition, the (mean, sigma) of the conventional FCA's offset voltage is (0.11mV, 3.11mV), whereas that of the proposed FCA is (-0.39mV, 2.26mV), so the offset spread of the proposed FCA is also moderately smaller.

Therefore, the (mean, sigma) of the proposed FCA's FOM<sub>s</sub>, FOM<sub>L</sub> and FOM<sub>Ts\_0.01%</sub> are respectively (0.92 pF\*MHz/µA, 0.017 pF\*MHz/µA), (2.11 pF\*V/µA-µs, 0.20 pF\*V/µA-µs) and (0.74 pF/µA-µs, 0.026 pF/µA-µs). The (mean, sigma) of the conventional FCA's FOM<sub>s</sub>, FOM<sub>L</sub> and FOM<sub>Ts\_0.01%</sub> are respectively (0.56 pF\*MHz/µA, 0.003 pF\*MHz/µA), (0.35 pF\*V/µA-µs, 0.003 pF\*V/µA-µs) and (0.266 pF/µA-µs, 0.012 pF/µA-µs). As can be seen, the average improvements of FOM<sub>s</sub>, FOM<sub>L</sub> and FOM<sub>Ts\_0.01%</sub> brought by the proposed FCA are respectively 1.65 times, 6.0 times and 2.8 times the performance of the conventional FCA.

#### 6.5.3.1 Performance Summary for P.Mis variation

The performance summary of the proposed and conventional FCAs is shown in Table 6.5. As can be seen, compared with the conventional FCA, the proposed FCA not only reduces power consumption but also improves settling performance under P.Mis variations. In addition, the proposed FCA's DC gain is also largely improved.

Compared with the conventional FCA, the proposed FCA has 27% less supply current, 21% less integrated noise, 28% less offset voltage, 48% less Ts\_0.1% and Ts\_0.01%, and 20dB more DC gain. The simultaneous performance enhancement on these critical specifications under P.Mis variations clearly demonstrates the advantages and robustness of the proposed FCA.



| Output                  | Unit           | Proposed mean | Proposed stdev | Conventional mean | Conventional stdev |
|-------------------------|----------------|------------|-------|--------------|-------|
| Vos                     | mV             | -0.39      | 2.26  | 0.11         | 3.11  |
| GBW                     | MHz            | 2.38       | 0.053 | 1.98         | 0.034 |
| PM                      | degree         | 75.47      | 0.345 | 73.8         | 0.64  |
| DC Gain                 | dB             | 103.6      | 0.633 | 82           | 3.4   |
| SR_avg                  | V/µs           | 5.44       | 0.527 | 1.23         | 0.024 |
| Ts_0.1%                 | μs             | 0.38       | 0.027 | 0.75         | 0.02  |
| Ts_0.01%                | μs             | 0.53       | 0.032 | 1.08         | 0.02  |
| Vni @ 100KHz            | nV/sqrt(Hz)    | 73.43      | 0.833 | 88.6         | 1.9   |
| Vno integrated to 2MHz  | μV             | 99.19      | 0.98  | 128          | 2.0   |
| FOMs                    | pF*MHz/µA      | 0.92       | 0.017 | 0.56         | 0.006 |
| FOML                    | pF*V/µA-µs     | 2.11       | 0.203 | 0.35         | 0.003 |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs       | 1.04       | 0.077 | 0.38         | 0.003 |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs       | 0.74       | 0.045 | 0.265        | 0.004 |
| Isupply                 | μA             | 2.58       | 0.041 | 3.5          | 0.19  |
| Iwaste                  | μA             | 1.08       | 0     | 2.0          | 0.177 |
| Itail                   | μA             | 1.5        | 0.04  | 1.50         | 0.028 |
| Iwaste/Itail            | %              | 71.77      | 0.872 | 134.9        | 11.7  |
| Itail/Isupply           | %              | 58.2       | 0.23  | 42.8         | 0.16  |
| CL                      | pF             | 1.00       | NA    | 1.00         | NA    |
| Vsupply                 | V              | 1.8        | NA    | 1.8          | NA    |
| Process                 |                | 180nm CMOS |       |              |       |

Table 6.5: Performance summary of the proposed and conventional FCAs under P.Mis variation
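The figure-of-merit rows in Table 6.5 can be cross-checked against the mean GBW, slew rate, settling time and supply current in the same table. The sketch below is only a consistency check under assumed definitions inferred from the units of each row (FOM_S = GBW·CL/Isupply, FOM_L = SR·CL/Isupply, FOM_Ts = CL/(Isupply·Ts)); the function and key names are illustrative. Because it uses mean values rather than per-sample Monte Carlo data, it should only be expected to match the tabulated means to within a few percent.

```python
# Consistency check of the FOM rows in Table 6.5 using the tabulated means.
# Assumed definitions (inferred from the units in the table, not quoted
# from the text):
#   FOM_S  = GBW * CL / Isupply     [pF*MHz/uA]
#   FOM_L  = SR  * CL / Isupply     [pF*V/(uA*us)]
#   FOM_Ts = CL / (Isupply * Ts)    [pF/(uA*us)]

def figures_of_merit(gbw_mhz, sr_v_per_us, ts_01_us, ts_001_us,
                     isupply_ua, cl_pf=1.0):
    """Return the four figures of merit for one amplifier."""
    return {
        "FOM_S": gbw_mhz * cl_pf / isupply_ua,
        "FOM_L": sr_v_per_us * cl_pf / isupply_ua,
        "FOM_Ts_0.1": cl_pf / (isupply_ua * ts_01_us),
        "FOM_Ts_0.01": cl_pf / (isupply_ua * ts_001_us),
    }

# Mean values copied from Table 6.5 (CL = 1 pF for both designs).
proposed = figures_of_merit(2.38, 5.44, 0.38, 0.53, 2.58)
conventional = figures_of_merit(1.98, 1.23, 0.75, 1.08, 3.5)

for name in proposed:
    ratio = proposed[name] / conventional[name]
    print(f"{name}: proposed={proposed[name]:.3f} "
          f"conventional={conventional[name]:.3f} ratio={ratio:.2f}")
```

Under these assumed definitions, the computed values land close to the FOM rows of Table 6.5, which suggests the tabulated figures of merit follow directly from the supply current and the speed metrics.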

## **6.6. Performance comparison to the literature**

Table 6.6 summarizes the performance of the proposed FCA compared with Chapter 5's proposed FCA, the FCA in [1] and the conventional FCA in a typical corner at room temperature. Compared with Chapter 5's proposed FCA, the FCA in [1] and the conventional FCA, the proposed FCA enhances DC gain by 14dB, 11dB and 20dB respectively, enhances slew rate by 1.5 times, 4.0 times and 4.4 times respectively, reduces Ts\_0.1% by 22%, 54.4% and 48% respectively, and reduces Ts\_0.01% by 26%, 66.3% and 50% respectively. As a result, the FOM<sub>Ts\_0.01%</sub> of this proposed FCA and that of Chapter 5's proposed FCA are nearly the same (0.72 and 0.73 pF/µA-µs), about 3 times that of [1] and 2.68 times that of the conventional FCA. Again, this comparison clearly demonstrates the advantages of this work (Chapter 6's proposed FCA) over [1] and the conventional FCA. Compared with Chapter 5's proposed FCA, this work shows comparable figures of merit but much higher DC gain, which is a critical specification for high-precision systems. This work also demonstrates that the gain enhancement, slew rate enhancement and current utilization enhancement techniques proposed in this dissertation can be combined in a single FCA design.

| Output                  | Unit        | This work | FCA in Chapter 5 | FCA in [1] | Conv. FCA |
|-------------------------|-------------|-------|------------|-------|-------|
| Vos                     | mV          | 2.26  | 2.18       | 2.64  | 2.95  |
| GBW                     | MHz         | 2.38  | 2.14       | 2.2   | 2.04  |
| PM                      | degree      | 75.5  | 70         | 70    | 74    |
| DC Gain                 | dB          | 103.8 | 89.69      | 92.75 | 83.5  |
| Isupply                 | μA          | 2.58  | 1.88       | 2.6   | 3.5   |
| Iwaste                  | μA          | 1.08  | 0.38       | 1.1   | 2.0   |
| Itail                   | μA          | 1.5   | 1.50       | 1.50  | 1.50  |
| Iwaste/Itail            | %           | 71.77 | 25.25      | 73.3  | 133.3 |
| Itail/Isupply           | %           | 58    | 80         | 60    | 43    |
| SR_avg                  | V/µs        | 5.46  | 3.66       | 1.355 | 1.23  |
| Ts_0.1%                 | μs          | 0.39  | 0.50       | 0.855 | 0.75  |
| Ts_0.01%                | μs          | 0.54  | 0.73       | 1.6   | 1.08  |
| Vni @ 100KHz            | nV/sqrt(Hz) | 73.47 | 68.07      | 82.2  | 88.5  |
| Vno integrated to 2MHz  | μV          | 99.24 | 93.20      | 113.0 | 127.5 |
| FOMs                    | pF*MHz/µA   | 0.93  | 1.14       | 0.87  | 0.565 |
| FOML                    | pF*V/µA-µs  | 2.12  | 1.92       | 0.52  | 0.352 |
| FOM <sub>Ts_0.1%</sub>  | pF/µA-µs    | 0.99  | 1.08       | 0.449 | 0.38  |
| FOM <sub>Ts_0.01%</sub> | pF/µA-µs    | 0.72  | 0.73       | 0.24  | 0.268 |
| FOM <sub>noise</sub>    | $(V/V)^2$   | 2.78  | 2.6        | 3.6   | 5.0   |
| CL                      | pF          | 1.00  | 1.00       | 1.00  | 1.00  |
| Process                 |             | 180nm CMOS |            |       |       |

Table 6.6: Performance comparison of the proposed FCA to the literature
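The improvement figures quoted in this section can be recomputed from the typical-corner entries of Table 6.6. The following sketch is only an arithmetic check; the dictionary keys and the `compare` helper are hypothetical names introduced for illustration.

```python
# Typical-corner values copied from Table 6.6.
# sr is SR_avg in V/us; ts_01 and ts_001 are Ts_0.1% and Ts_0.01% in us.
designs = {
    "this_work": {"dc_gain_db": 103.8, "sr": 5.46,  "ts_01": 0.39,  "ts_001": 0.54},
    "chapter5":  {"dc_gain_db": 89.69, "sr": 3.66,  "ts_01": 0.50,  "ts_001": 0.73},
    "ref1":      {"dc_gain_db": 92.75, "sr": 1.355, "ts_01": 0.855, "ts_001": 1.6},
    "conv":      {"dc_gain_db": 83.5,  "sr": 1.23,  "ts_01": 0.75,  "ts_001": 1.08},
}

def compare(ref_name, new_name="this_work"):
    """Improvement of new_name over ref_name for the quoted metrics."""
    ref, new = designs[ref_name], designs[new_name]
    return {
        "gain_delta_db": new["dc_gain_db"] - ref["dc_gain_db"],
        "sr_ratio": new["sr"] / ref["sr"],
        "ts_01_reduction_pct": 100.0 * (1.0 - new["ts_01"] / ref["ts_01"]),
        "ts_001_reduction_pct": 100.0 * (1.0 - new["ts_001"] / ref["ts_001"]),
    }

for ref_name in ("chapter5", "ref1", "conv"):
    result = compare(ref_name)
    summary = ", ".join(f"{key}={value:.2f}" for key, value in result.items())
    print(f"this_work vs {ref_name}: {summary}")
```

Reading the gain difference in dB, the slew rate as a ratio and the settling times as percentage reductions mirrors how the text quotes each metric.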



## 6.7. Discussion

The design example with combined gain, slew rate and current utilization efficiency enhancement techniques shows the advantages of the design over [1] in terms of gain, settling time, power consumption, noise, and offset voltage. The DC gain enhancement over [1] is 11dB, which leads to a DC gain of about 104dB for a single-stage FCA. The gain enhancement is limited because only the conductance cancellation circuit for the NMOS side of the proposed FCA is implemented, as shown in Figure 6.1. If a larger DC gain is needed for an application, a similar gain enhancement circuit can also be implemented for the complementary side of the proposed FCA.

## 6.8. Summary

A design example with a combination of enhancement techniques for gain, slew rate and current utilization efficiency has been introduced. Compared to the state-of-the-art method [1], the proposed FCA increases DC gain by 11dB, improves slew rate by 4.04 times, and reduces the 0.1% and 0.01% settling times by factors of 2.2 and 2.96 respectively. Due to its design simplicity, high current utilization efficiency, low noise and low offset voltage, the proposed FCA is suitable for applications and systems where FCAs are used as single-stage amplifiers or as first stages in multi-stage amplifiers. The applications include, but are not limited to, battery monitoring circuits, load current sensing circuits, data converters and switched-capacitor circuits.

## **6.9. References**

[1] R. Eschauzier and N. van Rijn, "Apparatus and method for a compact class AB turnaround stage with low noise, low offset, and low power consumption," U.S. Patent No. 6,624,696, 23 Sep. 2003.



# CHAPTER 7. CONCLUSION

In this research, a series of performance enhancement techniques for operational amplifiers has been introduced, including techniques for gain enhancement, slew rate enhancement, current utilization efficiency enhancement and power efficiency enhancement.

In Chapter 2, a new method to robustly improve an op amp's DC gain with negligible power and area overhead via conductance cancellation has been introduced. The uniqueness of this gain enhancement technique lies in its ability to track and cancel conductance robustly under PVT variations without the aid of any tuning circuit. Because of this capability, the proposed method can provide over 20dB of DC gain enhancement that is well sustained under PVT variations. Compared with the regulated gain boosting technique, the proposed method offers several benefits. First, it does not degrade an op amp's settling performance, including its high-precision settling. Second, the design and simulation effort involved is minimal, whereas the regulated gain boosting technique needs a significant amount of design and simulation effort to address instability and pole-zero doublet issues. Third, the power and area consumption of the proposed gain enhancement technique are very low. Due to its design simplicity, low power and low area cost with no degradation of an op amp's settling time, the proposed technique is suitable for op amps in high-precision systems such as switched-capacitor circuits, ADC drivers and filters.
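To give a feel for what a 20dB enhancement demands of the cancellation, the sketch below uses a deliberately simplified first-order model: DC gain is taken as gm/g_out, and the cancellation circuit is assumed to remove a fixed fraction of g_out. This model and the function name are assumptions made for illustration; they are not the analysis developed in Chapter 2.

```python
import math

def gain_enhancement_db(cancel_fraction):
    """DC gain increase in dB when a fraction of the output conductance
    is cancelled.

    Simplified model (an assumption for illustration): A = gm / g_out and
    g_out_eff = g_out * (1 - cancel_fraction), so the gain rises by
    -20*log10(1 - cancel_fraction) dB.
    """
    if not 0.0 <= cancel_fraction < 1.0:
        raise ValueError("cancel_fraction must be in [0, 1)")
    return -20.0 * math.log10(1.0 - cancel_fraction)

for fraction in (0.5, 0.9, 0.95):
    print(f"{fraction:.0%} of g_out cancelled -> "
          f"+{gain_enhancement_db(fraction):.1f} dB")
```

In this model a 20dB enhancement corresponds to cancelling 90% of the output conductance, which is why the cancelling conductance has to track the original one closely across PVT variations.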

In Chapter 3, we have introduced a new slew rate enhancement (SRE) circuit, which can largely improve an amplifier's slew rate via excessive transient feedback in the slewing phases while preserving the amplifier's small-signal performance through a well-defined turn-on condition. This nonlinear operation of the introduced SRE circuit improves the linearity of the entire amplifier. In addition, the transient current efficiency of the proposed SRE method is also high in the slewing phases because the increased transient tail current always improves the amplifier's slew rate, regardless of whether the transient tail current functions as common-mode or differential-mode current for the amplifier's input pair. Due to the low power consumption, low area overhead, design simplicity and high effectiveness of the proposed SRE method, the method is suitable for applications that need large capacitive driving capability with low static power dissipation, such as switched-capacitor circuits.
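The need for transient current boosting can be illustrated with the basic relation I = C·dV/dt. The sketch below applies it to the average slew rates and the 1.5µA quiescent tail current reported in Table 6.6 for the Chapter 6 design example; treating the average slew rate as a constant ramp is a simplifying assumption, and the helper name is hypothetical.

```python
def slewing_current_ua(slew_rate_v_per_us, cl_pf):
    """Current in uA needed to slew a load of cl_pf at slew_rate_v_per_us.

    From I = C * dV/dt: pF * V/us = 1e-12 F * 1e6 V/s = 1e-6 A,
    so the plain product is already expressed in uA.
    """
    return slew_rate_v_per_us * cl_pf

I_TAIL_UA = 1.5  # quiescent tail current from Table 6.6
CL_PF = 1.0      # load capacitance from Table 6.6

cases = (("conventional FCA", 1.23), ("FCA with SRE (Chapter 6)", 5.46))
for name, slew_rate in cases:
    required = slewing_current_ua(slew_rate, CL_PF)
    print(f"{name}: {required:.2f} uA needed, "
          f"{required / I_TAIL_UA:.2f}x the quiescent tail current")
```

Under this simple model, the slew rate reported for the enhanced design implies an average load current well above the quiescent tail current, which is consistent with the SRE circuit supplying extra current only during the slewing phases.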

In Chapter 4, we have introduced a power-efficient design method for an op amp to drive very large capacitive loads. The proposed method has several advantages compared with the state-of-the-art methods for driving large capacitive loads. First, the proposed method decouples the large- and small-signal paths so that both the small- and large-signal performance of the op amp can be optimized simultaneously. Second, the op amp designed with the proposed method has a well-defined quiescent current. As a result, the designed op amp is not sensitive to random device mismatch. Third, the amount of wasted current in the preamp's load circuit is reduced to zero. Due to these three advantages, the designed op amp is able to offer favorable small-signal and large-signal figures of merit simultaneously. This is an important improvement compared with the state-of-the-art methods, which can only improve small-signal figures of merit at the cost of large-signal figures of merit or vice versa. The proposed power-efficient op amp design is suitable for applications where large capacitive loads need to be driven, such as LCD buffers and electro-chemical sensors.

In Chapter 5, a new technique that improves the current utilization efficiency (CUE) of a folded cascode amplifier (FCA) has been introduced. Compared with the state-of-the-art techniques, the proposed method provides several benefits. First, the dependency of the FCA's nondominant poles and phase margin on the bias current of the FCA's cascode stage is largely relaxed. Therefore, the proposed FCA can significantly reduce current consumption in the cascode stage. Second, the proposed method does not suffer from a long recovery time after the slewing phases complete, even though the cascode stage's bias current is as low as 8.3% of the FCA's tail current. Third, the proposed method does not need any frequency compensation. As a result, the design complexity and area consumption of the designed FCA are significantly reduced. In addition, compared with the conventional FCA design and the state-of-the-art method, which improve settling time only with increased power consumption, the designed FCA achieves faster settling with decreased power consumption. Therefore, the proposed CUE enhancement technique is suitable for applications and systems where an FCA is used as a single-stage amplifier or as the first stage in a multi-stage amplifier. The applications include, but are not limited to, battery monitoring circuits, load current sensing circuits, data converters, and switched-capacitor circuits.
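The current bookkeeping behind this comparison can be sketched from the Itail and Iwaste rows of Table 6.6. The sketch assumes that the supply current is simply Itail + Iwaste and that current utilization is read off as the Itail/Isupply row of the table; both are assumptions inferred from the tabulated values, and the helper name is illustrative.

```python
# (Itail, Iwaste) in uA, copied from Table 6.6.
currents = {
    "this work (Chapter 6)": (1.5, 1.08),
    "Chapter 5 FCA": (1.5, 0.38),
    "FCA in [1]": (1.5, 1.1),
    "conventional FCA": (1.5, 2.0),
}

def current_utilization(i_tail_ua, i_waste_ua):
    """Return (Isupply, Itail/Isupply), assuming Isupply = Itail + Iwaste."""
    i_supply_ua = i_tail_ua + i_waste_ua
    return i_supply_ua, i_tail_ua / i_supply_ua

for name, (i_tail, i_waste) in currents.items():
    i_supply, utilization = current_utilization(i_tail, i_waste)
    print(f"{name}: Isupply={i_supply:.2f} uA, "
          f"Itail/Isupply={utilization:.1%}")
```

With these assumptions the computed supply currents match the Isupply row of Table 6.6, and the utilization ratios agree with the Itail/Isupply row to within a few percentage points.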

In Chapter 6, we have presented a designed FCA with gain, slew rate and current utilization efficiency enhancement techniques combined. The designed FCA confirms the compatibility of the performance enhancement techniques proposed in Chapters 2, 3 and 5. Compared with the conventional FCA, the design example shows multiple simultaneous performance improvements, including power consumption, gain, slew rate, and settling time. As natural byproducts of the power consumption reduction, the offset voltage and noise of the designed FCA are also decreased. Therefore, the designed FCA can be used for applications where wide input common-mode range, high gain, fast settling, low noise and low offset are needed, such as the sample-and-hold circuits of pipeline ADCs and sigma-delta ADCs.

